NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods.

2. BACKGROUND

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as 
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past 
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the 
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of 
hybridization cloning; activity of the protein in the case of expression cloning). More recent 
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences 
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic 
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more 
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides, and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by 
hybridization (SBH), and in some cases, sequences obtained from one or more public databases. 

The invention also relates to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960. The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic
acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.
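As an illustration of the 90% identity criterion, the following is a minimal Python sketch. It assumes two sequences that are already aligned and of equal length, with no gaps; in practice identity is usually computed from an alignment produced by a dedicated tool. The function names `percent_identity` and `meets_identity_threshold` are hypothetical and are not taken from the specification.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with the same base in two equal-length, pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True when the ungapped identity is at least `threshold` percent (assumed default: 90)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For a 100 base pair identifying sequence, this criterion tolerates at most ten mismatched positions.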

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid 
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and 
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their 
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention. 
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a 
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders, as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION

4.1 DEFINITIONS

It must be noted that, as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references, unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
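The base-pairing rule in this definition can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, only the four DNA bases are handled, and the fraction returned is a simple count of Watson-Crick pairs between two antiparallel strands of equal length, not a measure of hybridization strength.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq_5to3: str) -> str:
    """Complementary strand read position by position, i.e. written 3'->5'."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def reverse_complement(seq_5to3: str) -> str:
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq_5to3)[::-1]


def complementarity(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that base-pair: 1.0 is complete, lower is partial."""
    if not strand_5to3 or len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(PAIRS[a] == b for a, b in zip(strand_5to3.upper(), other_3to5.upper()))
    return paired / len(strand_5to3)
```

With the example from the text, `complement("AGT")` gives `"TCA"` (read 3'->5'), and `complementarity("AGT", "TCA")` is `1.0`, corresponding to "complete" complementarity.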
The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies,
fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
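The full-match estimates above can be reproduced with a short Python sketch. It follows the same simplified reasoning as the text: bases are treated as uniformly random, and the ratio of possible k-mers to target length is read as "1 in X". The constant and function names are assumptions introduced for illustration, and the 5% expressed fraction is used as the upper bound stated in the text.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (figure from the text)
EXPRESSED_FRACTION = 0.05   # expressed sequences as a fraction of the genome (bound from the text)


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the alphabet A, C, G, T."""
    return 4 ** k


def one_in(k: int, target_bp: float = GENOME_BP) -> float:
    """Return X such that a given k-mer is fully matched in the target about 1 time in X."""
    return possible_kmers(k) / target_bp


if __name__ == "__main__":
    cases = [
        ("20-mer, whole genome", 20, GENOME_BP),
        ("17-mer, whole genome", 17, GENOME_BP),
        ("15-mer, expressed sequences", 15, GENOME_BP * EXPRESSED_FRACTION),
    ]
    for label, k, target in cases:
        print(f"{label}: about 1 in {one_in(k, target):.1f}")
```

Under these assumptions the ratios come out near 367, 5.7 and 7.2 respectively, which agrees in order of magnitude with the rounded figures of 300 and "approximately 1 in 5" given in the text.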

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
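The single-mismatch calculation described above can be sketched in Python in the same way. The sketch multiplies the full-match probability (1 ÷ 4^k) by the number of single-mismatch variants (3 x k) and by the number of positions in the target; the function name and the uniform random-sequence model are assumptions made for illustration.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (figure from the text)
EXPRESSED_BP = 150_000_000  # 5% of the genome, the expressed fraction bound given in the text


def single_mismatch_one_in(k: int, target_bp: float = GENOME_BP) -> float:
    """Return X such that a k-mer appears with exactly one mismatch about 1 time in X."""
    full_match = 1 / 4 ** k   # probability of a full match at a single position
    variants = 3 * k          # three alternative bases at each of the k positions
    return 1 / (full_match * variants * target_bp)


if __name__ == "__main__":
    print(f"25-mer, genome: about 1 in {single_mismatch_one_in(25):.0f}")
    print(f"20-mer, genome: about 1 in {single_mismatch_one_in(20):.1f}")
    print(f"18-mer, expressed: about 1 in {single_mismatch_one_in(18, EXPRESSED_BP):.1f}")
```

With these inputs the twenty-mer case comes out near 1 in 6 and the eighteen-mer case near 1 in 8.5, the same order of magnitude as the "approximately one in five" stated in the text, while the twenty-five mer is far less likely to occur by chance.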

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
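The ORF definition can be made concrete with a small Python sketch. It assumes the stop codons of the standard genetic code (TAA, TAG, TGA), which the text does not list, and it follows the definition as written: a run of triplets with no termination codon, without requiring a start codon. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption, not from the text)


def codons(seq: str, frame: int = 0) -> list:
    """Split a DNA sequence into complete triplets, starting at offset `frame`."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets and none is a termination codon."""
    return bool(seq) and len(seq) % 3 == 0 and not any(c in STOP_CODONS for c in codons(seq))


def longest_orf(seq: str) -> str:
    """Longest stop-free run of triplets across the three forward reading frames."""
    best = ""
    for frame in range(3):
        run = []
        # A sentinel stop codon at the end flushes the final run in each frame.
        for codon in codons(seq, frame) + ["TAA"]:
            if codon in STOP_CODONS:
                candidate = "".join(run)
                if len(candidate) > len(best):
                    best = candidate
                run = []
            else:
                run.append(codon)
    return best
```

A fuller search would also scan the reverse complement strand, giving six reading frames in total.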

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide only encoding for
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as 
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
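The four amino acid groups listed above lend themselves to a simple lookup. The Python sketch below encodes them with standard one-letter codes and labels a substitution as conservative when both residues fall in the same group. The function names, the one-letter encoding and the same-group rule are illustrative assumptions; the text itself allows similarity to be judged on several properties.

```python
# Groups exactly as listed in the text, written with standard one-letter codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the group that contains the given one-letter residue."""
    for name, members in AMINO_ACID_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(ref: str, var: str) -> bool:
    """True when two different residues belong to the same group."""
    return ref.upper() != var.upper() and residue_class(ref) == residue_class(var)


def classify_substitutions(reference: str, variant: str) -> list:
    """List (position, ref, var, label) for each difference between equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    changes = []
    for pos, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1):
        if r != v:
            label = "conservative" if is_conservative(r, v) else "non-conservative"
            changes.append((pos, r, v, label))
    return changes
```

For example, replacing lysine with arginine stays within the basic group and is labeled conservative, whereas replacing aspartic acid with alanine crosses groups and is labeled non-conservative.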

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use 
in yeast or eukaryotic expression systems preferably include a leader sequence enabling 
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which 
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed. 
"Secreted" proteins also include without limitation proteins that are transported across the 
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include 
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and 
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g., 
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 
16:27-55). 

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the 
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization 
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e., 
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are 
described herein in the examples. 

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent 
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base 
oligonucleotides), and 60°C (for 23-base oligonucleotides). 
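
The oligonucleotide wash temperatures recited above can be tabulated, for example, as follows. This Python sketch is illustrative only: the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical, and the sketch deliberately covers only the four oligonucleotide lengths stated in the text, without interpolating temperatures for other lengths.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the exemplary wash temperature for a listed oligo length.

    Lengths not recited in the text are rejected rather than guessed.
    """
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no exemplary wash temperature is recited for {oligo_length}-base oligos"
        ) from None


if __name__ == "__main__":
    for length in sorted(OLIGO_WASH_TEMP_C):
        print(length, wash_temperature_c(length))
```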

WO 01/57190 PCT/US01/04098 

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity) and in a further variation of this embodiment, by no 
more than 10% (90% sequence identity) and in a further variation of this embodiment, by no 
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid 
sequences according to the invention preferably have at least 80% sequence identity with a listed 
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least 
90% sequence identity, more preferably at least 95% sequence identity, and most preferably at 
least 98% sequence identity. Substantially equivalent 
nucleotide sequences of the invention can have lower percent sequence identities, taking into 
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide 
sequence has at least about 65% identity, more preferably at least about 75% identity, more 
preferably at least about 80% identity, more preferably at least about 85% identity, more 
preferably at least about 90% identity, more preferably at least about 95% identity, more 
preferably at least 98% identity, and most preferably at least about 99% identity. For the 
purposes of the present invention, sequences having substantially equivalent biological activity 
and substantially equivalent expression characteristics are considered substantially equivalent. 
For the purposes of determining equivalence, truncation of the mature sequence (e.g., via a 
mutation which creates a spurious stop codon) should be disregarded. Sequence identity may be 
determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). 
Identity between sequences can also be determined by other methods known in the art, e.g., by 
varying hybridization conditions. 
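
The arithmetic of the 35% figure discussed above can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the two sequences are supplied already aligned (equal-length strings using "-" for gaps), each gapped or mismatched column is counted as one substitution, addition or deletion, and the names fraction_varied and is_substantially_equivalent are hypothetical. It does not implement the Jotun Hein method, and it addresses only the sequence comparison, not biological activity or expression characteristics.

```python
def fraction_varied(reference_aln: str, candidate_aln: str, gap: str = "-") -> float:
    """Changes divided by the number of residues in the candidate sequence.

    Both arguments are rows of one pairwise alignment (an assumption of
    this sketch); columns that differ, including gapped columns, count as
    one substitution, addition, or deletion each.
    """
    if len(reference_aln) != len(candidate_aln):
        raise ValueError("aligned sequences must have the same length")
    changes = 0
    for ref_res, cand_res in zip(reference_aln, candidate_aln):
        if ref_res == gap and cand_res == gap:
            continue  # column empty in both rows carries no information
        if ref_res != cand_res:
            changes += 1
    candidate_len = sum(1 for res in candidate_aln if res != gap)
    if candidate_len == 0:
        raise ValueError("candidate sequence contains no residues")
    return changes / candidate_len


def is_substantially_equivalent(reference_aln: str, candidate_aln: str,
                                max_fraction: float = 0.35) -> bool:
    """Apply the 'about 0.35 or less' threshold to the computed fraction."""
    return fraction_varied(reference_aln, candidate_aln) <= max_fraction


if __name__ == "__main__":
    # Hypothetical aligned sequences: one substitution in ten residues.
    print(fraction_varied("ACGTACGTAC", "ACGTTCGTAC"))
    print(is_substantially_equivalent("ACGTACGTAC", "ACGTTCGTAC"))
```

The max_fraction parameter can be lowered (for example to 0.30, 0.25, 0.20, 0.10 or 0.05) to correspond to the narrower embodiments recited above.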

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 



4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the 
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a 
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936, 
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the 
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968, 
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, 
but are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the 
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth 
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a 
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a 
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a 
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the 
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of 
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like 
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or 
combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains. 

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed 
herein. The corresponding genes can be isolated in accordance with known methods using the 
sequence information disclosed herein. Such methods include the preparation of probes or primers 
from the disclosed sequence information for identification and/or amplification of genes in 
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can 
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that 
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable 
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 
3937-3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s) 
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA 
libraries. 

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information, 
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about 
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%, 
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more 
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited 
above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid 
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences 
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which 
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater 
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the 
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to 
a polynucleotide can differentiate polynucleotide sequences of the invention from other 
polynucleotide sequences in the same family of genes or can differentiate human genes from 
genes of other species, and are preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these 
specific sequences, but also include allelic and species variations thereof. Allelic and species 
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984, 
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at 
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate 
codon variability, the invention includes nucleic acid molecules coding for the same amino acid 
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an 
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly 
contemplated. 
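
The codon substitution described above can be illustrated by checking whether two coding sequences translate to the same amino acid sequence. The following Python sketch assumes the standard genetic code and DNA-alphabet ORFs whose length is a multiple of three; the names CODON_TABLE, translate and encode_same_protein, and the short example sequences, are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(orf: str) -> str:
    """Translate a DNA coding sequence codon by codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True when both ORFs encode the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT -> GCC (both alanine) and AAA -> AAG (both lysine).
    print(encode_same_protein("ATGGCTAAA", "ATGGCCAAG"))
    # GCT -> GAT changes alanine to aspartic acid.
    print(encode_same_protein("ATGGCTAAA", "ATGGATAAA"))
```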

The nearest neighbor or homology result for the nucleic acids of the present invention, 
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a 
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local 
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a 
FASTA version 3 search against Genpept, using the Fastxy algorithm, may be used. 

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 



The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which 
encode variants of the described nucleic acids. These amino acid sequence variants may be 
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids 
encoding the amino acid sequence variants are preferably constructed by mutating the 
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic 
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 
will typically be modified in series, e.g., by substituting first with conservative choices (e.g., 
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant 
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions 
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one 
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid 
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues, 
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are 
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a 
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent 
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the 
site being changed. In general, the techniques of site-directed mutagenesis are well known to 
those of skill in the art and this technique is exemplified by publications such as Edelman et al., 
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a 
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids. 
When small amounts of template DNA are used as starting material, primer(s) that differ 
slightly in sequence from the corresponding region in the template DNA can generate the desired 
amino acid variant. PCR amplification results in a population of product DNA fragments that 
differ from the polynucleotide template encoding the polypeptide at the position specified by the 
primer. The product DNA fragments replace the corresponding region in the plasmid and this 
gives a polynucleotide encoding the desired amino acid variant. 
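
The structure of a mutagenic oligonucleotide as described above (the altered codon flanked on both sides by unchanged template sequence) can be sketched in Python as follows. This is an illustration only: the function name mutagenic_oligo, the zero-based codon index, the default flank length of 12 nucleotides and the example template are all assumptions made for the sketch, and no particular flank length is specified in the text.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Build an oligo carrying new_codon flanked by template sequence.

    template    : coding-strand DNA, read in frame from its first base
    codon_index : zero-based index of the codon to replace
    flank       : unchanged nucleotides kept on each side (assumed value)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence on both sides")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + new_codon + right


if __name__ == "__main__":
    # Hypothetical template; codon 5 (GAA) is replaced by GCA.
    template = "ATGGCTAAAGGTCTGGAAACCTGTCATTGGTAA"
    print(mutagenic_oligo(template, codon_index=5, new_codon="GCA", flank=6))
```

In practice the flank length would be chosen so that the oligonucleotide forms a stable duplex with the template on either side of the changed site.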

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic 
code, other DNA sequences which encode substantially the same or a functionally equivalent 
amino acid sequence may be used in the practice of the invention for the cloning and expression 
of these novel nucleic acids. Such DNA sequences include those which are capable of 
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions. 

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 
3937-3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof, 
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified 
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 
3949-3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the 
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral 
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984, 
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse 
orientation. In the case of a vector comprising one of the ORFs of the present invention, the 
vector may further comprise regulatory sequences, including for example, a promoter, operably 
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill 
in the art and are commercially available for generating the recombinant constructs of the present 
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript, 
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, 
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, 
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). 

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al., 
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many 
suitable expression control sequences are known in the art. General methods of expressing 
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in 
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated 
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, 
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 
Generally, recombinant expression vectors will include origins of replication and selectable 
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli 
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct 
transcription of a downstream structural sequence. Such promoters can be derived from operons 
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid 
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and 
preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a 
fusion protein including an amino terminal identification peptide imparting desired 
characteristics, e.g., stabilization or simplified purification of expressed recombinant product. 
Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species 
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be 
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine 
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These 
pBR322 "backbone" sections are combined with an appropriate promoter and the structural 
sequence to be expressed. Following transformation of a suitable host strain and growth of the 
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by 
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies 
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 




4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or 
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is 
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding 
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In 
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence 
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding 
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs, 
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided. 

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 
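
Because an antisense nucleic acid is complementary to the sense (coding) strand, a candidate antisense oligonucleotide for a chosen portion of that strand can be derived as its reverse complement. The following Python sketch is illustrative only: it assumes Watson-Crick pairing over a DNA alphabet (for an RNA antisense molecule, U would replace T), and the names reverse_complement and antisense_oligo, the default length of 20 nucleotides and the example sequence are hypothetical.

```python
# Watson-Crick complements for a DNA alphabet (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int = 20) -> str:
    """Antisense oligo to coding_strand[start:start + length].

    start is a zero-based offset into the coding strand; the default
    length of 20 is an assumption chosen for illustration.
    """
    if length <= 0 or start < 0 or start + length > len(coding_strand):
        raise ValueError("target window lies outside the coding strand")
    target = coding_strand[start:start + length]
    return reverse_complement(target)


if __name__ == "__main__":
    # Hypothetical sense sequence spanning a translation start site (ATG).
    sense = "CCACCATGGCTAAAGGTCTG"
    print(antisense_oligo(sense, start=2, length=10))
```

The target window can be placed within the coding region or within the 5' or 3' untranslated regions, according to the embodiment of interest.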

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the molecules or to
increase the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
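
As a minimal illustration of the Watson-Crick design rule described above, the following Python sketch derives an antisense oligonucleotide as the reverse complement of a window of coding-strand sequence around the translation start site. The example sequence, the function names and the window parameters are hypothetical and are not taken from the sequence listing; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(coding_strand: str, length: int = 20, upstream: int = 5) -> str:
    """Design an antisense oligonucleotide spanning the first ATG.

    The target window starts `upstream` bases 5' of the first ATG (or at the
    start of the sequence, if fewer bases are available) and covers up to
    `length` bases of the coding strand. The oligo is the reverse complement
    of that window.
    """
    seq = coding_strand.upper()
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    begin = max(0, start - upstream)
    window = seq[begin:begin + length]
    return reverse_complement(window)


# Hypothetical coding-strand fragment, for illustration only.
example = "GGCACCATGGCTAGCAAGGTTCTGACCGAT"
oligo = antisense_oligo(example, length=20, upstream=5)
print(oligo)
```

The oligonucleotide length can be set to any of the values mentioned above (e.g., about 15 to 50 nucleotides), provided the coding-strand sequence is long enough to supply the window.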

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14:807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.






WO 01/57190 PCT/US01/04098

The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g.,
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%,
83%, 84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least
about 90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
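
The sequence identity thresholds recited above can be illustrated with a short Python sketch. It assumes the two amino acid sequences have already been aligned by some external method and are supplied as equal-length strings with "-" marking gaps. The function names, the example sequences and the choice of denominator (all aligned columns that are not gaps in both sequences) are assumptions made for illustration; the specification does not prescribe a particular identity calculation here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Columns that are gaps in both sequences are ignored. A column counts as
    a match only when both sequences carry the same residue (not a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 65.0) -> bool:
    """True if the pair reaches the given percent identity (e.g., 65, 85, 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue peptides differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant, threshold=95.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages for gapped alignments, so the convention used should be stated alongside any reported value.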

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
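
The definition of a "degenerate variant" given above can be checked mechanically: two ORFs are degenerate variants of one another if their nucleotide sequences differ but their translations are identical. The Python sketch below assumes the standard genetic code and in-frame DNA sequences; the function names and example ORFs are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Hypothetical ORFs: GCT and GCC both encode alanine.
print(is_degenerate_variant("ATGGCTTAA", "ATGGCCTAA"))
```

Organisms that use a non-standard genetic code would require a different codon table.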

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides,
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
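
The alanine-scanning substitution scheme described above can be sketched in Python as follows. The sketch only enumerates the variant sequences to be made and tested; it does not predict activity. The function name, the `window` parameter (1 for single residues, larger values for strings of residues) and the example peptide are hypothetical.

```python
from typing import List, Tuple


def alanine_scan(protein: str, window: int = 1) -> List[Tuple[int, str, str]]:
    """Enumerate alanine-substituted variants of a protein sequence.

    Returns (position, original_residues, variant_sequence) tuples, where
    position is the 1-based index of the first substituted residue. Windows
    that already consist entirely of alanine are skipped, since substituting
    them would reproduce the parent sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    seq = protein.upper()
    variants = []
    for i in range(len(seq) - window + 1):
        original = seq[i:i + window]
        if set(original) == {"A"}:
            continue
        variant = seq[:i] + "A" * window + seq[i + window:]
        variants.append((i + 1, original, variant))
    return variants


# Hypothetical four-residue peptide, scanned one residue at a time.
for position, original, variant in alanine_scan("MKAC"):
    print(position, original, variant)
```

Each variant would then be expressed and assayed, and positions at which substitution abolishes activity would be flagged as functionally important.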

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY
Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
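
As a rough illustration of how identity can be computed once two sequences have been aligned to give the largest match, the following sketch performs a simple global alignment and reports percent identity over the alignment length. It is not the GAP or BLAST implementation; the function names and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example, and the cited programs use substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical, non-gap residues."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    aligned = global_align("ACGT", "AGT")
    print(aligned, percent_identity(*aligned))
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns, so the denominator used here is one choice among several.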

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically for both the treatment of proliferative
and differentiative disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
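
The requirement that the two coding sequences be joined in-frame can be checked computationally before any cloning is attempted. The sketch below is illustrative only: `fuse_in_frame`, its parameters, and the example sequences are hypothetical, and it assumes that each input is a coding sequence that starts at the first base of a codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(n_part, c_part, linker=""):
    """Join two coding sequences so that the reading frame is preserved.

    The terminal stop codon of the N-terminal part, if present, is
    removed so that translation continues into the C-terminal part.
    Raises ValueError if any input would shift the frame or if the
    fused sequence contains a premature stop codon.
    """
    n_part, c_part, linker = (s.upper() for s in (n_part, c_part, linker))
    for name, seq in (("n_part", n_part), ("c_part", c_part), ("linker", linker)):
        if len(seq) % 3:
            raise ValueError(f"{name} length is not a multiple of three")
    n_codons = codons(n_part)
    if n_codons and n_codons[-1] in STOP_CODONS:
        n_codons = n_codons[:-1]
    fused = "".join(n_codons) + linker + c_part
    if any(c in STOP_CODONS for c in codons(fused)[:-1]):
        raise ValueError("premature stop codon in fused reading frame")
    return fused


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(fuse_in_frame("ATGAAATAA", "ATGCCCTGA"))
```

A check of this kind only confirms frame and stop-codon placement; it says nothing about whether the fused polypeptide folds or retains activity.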

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either over expressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptide as well as for studying modulators of
the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza, et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, venereal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146,
1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867,
1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.
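
By way of illustration only, the readout of a membrane migration assay of the kind referred to above might be reduced to a single ratio as sketched below. The function name, the replicate counts and the use of a medium-only control are assumptions made for this example and are not taken from the cited protocols.

```python
def chemotactic_index(migrated_test, migrated_control):
    """Ratio of the mean migrated-cell count in test wells to that in control wells.

    Both arguments are lists of replicate cell counts (hypothetical inputs).
    A value above 1 suggests induced migration; a value below 1 suggests inhibition.
    """
    if not migrated_test or not migrated_control:
        raise ValueError("each condition needs at least one replicate")
    mean_test = sum(migrated_test) / len(migrated_test)
    mean_control = sum(migrated_control) / len(migrated_control)
    if mean_control <= 0:
        raise ValueError("control migration must be positive")
    return mean_test / mean_control


# Hypothetical replicate counts of cells that crossed the membrane.
test_wells = [180, 210, 195]      # wells containing the candidate protein
control_wells = [60, 70, 65]      # wells containing medium only (assumed control)
index = chemotactic_index(test_wells, control_wells)
```

A ratio of this kind does not by itself distinguish directed movement (chemotaxis) from increased random movement (chemokinesis); that distinction depends on the design of the assay.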

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY

Polypeptides of the invention may be involved in cancer cell generation, proliferation or
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Tissue Culture Collection catalogs.
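
As a minimal sketch of how an effective dose might be estimated from an in vitro proliferation assay, the example below interpolates the dose giving 50% inhibition on a logarithmic dose scale. The dose series, the signal values and the choice of log-linear interpolation are all assumptions for illustration; the references cited above do not prescribe this calculation.

```python
import math


def percent_inhibition(treated_signal, untreated_signal):
    """Inhibition of proliferation relative to untreated cells, in percent."""
    return 100.0 * (1.0 - treated_signal / untreated_signal)


def interpolate_ic50(doses, inhibitions):
    """Dose giving 50% inhibition, by linear interpolation on log10(dose).

    `doses` must be positive and ascending; `inhibitions` are the matching
    percent-inhibition values. Returns None if 50% is never bracketed.
    """
    points = list(zip(doses, inhibitions))
    for (d_lo, i_lo), (d_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_dose = math.log10(d_lo) + fraction * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0 ** log_dose
    return None


# Hypothetical assay: proliferation signal with untreated cells set to 100.
doses_ug_per_ml = [0.1, 1.0, 10.0, 100.0]
signals = [95.0, 75.0, 25.0, 5.0]
inhibitions = [percent_inhibition(s, 100.0) for s in signals]
ic50 = interpolate_ic50(doses_ug_per_ml, inhibitions)
```

In practice a fitted dose-response curve would normally replace the interpolation, which is used here only to keep the example self-contained.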

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
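
The diminution in complex formation measured in a competitive binding assay might, for example, be expressed as a percentage and used to select candidate compounds, as sketched below. The compound names, the signal values, the subtraction of a nonspecific-binding background and the 50% threshold are hypothetical and are not specified in this description.

```python
def percent_displacement(bound_with_compound, bound_total, bound_nonspecific):
    """Percent reduction in specific complex formation caused by a test compound.

    `bound_total` is the signal without competitor; `bound_nonspecific` is the
    background signal (an assumed control) subtracted from every reading.
    """
    specific_total = bound_total - bound_nonspecific
    if specific_total <= 0:
        raise ValueError("total binding must exceed nonspecific binding")
    specific_with_compound = bound_with_compound - bound_nonspecific
    return 100.0 * (1.0 - specific_with_compound / specific_total)


def select_hits(readings, bound_total, bound_nonspecific, threshold_percent=50.0):
    """Return {compound: displacement} for compounds at or above the threshold."""
    hits = {}
    for name, bound in readings.items():
        displacement = percent_displacement(bound, bound_total, bound_nonspecific)
        if displacement >= threshold_percent:
            hits[name] = displacement
    return hits


# Hypothetical bound-label signals for three test compounds.
readings = {"compound_A": 1200.0, "compound_B": 9000.0, "compound_C": 4000.0}
hits = select_hits(readings, bound_total=10000.0, bound_nonspecific=1000.0)
```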

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
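
The comparison of two otherwise identical cell populations described above can be sketched as follows. The ligand names, the response values and the two-fold cutoff are hypothetical and serve only to illustrate the comparison; any measurable cellular response could stand in for the numbers used here.

```python
def differential_ligands(response_expressing, response_control, min_fold=2.0):
    """Ligands whose response in receptor-expressing cells exceeds the response
    in non-expressing cells by at least `min_fold` (an assumed cutoff).

    Both arguments map ligand name -> measured response for the same ligands.
    """
    candidates = []
    for ligand, signal in response_expressing.items():
        baseline = response_control[ligand]
        if baseline > 0 and signal / baseline >= min_fold:
            candidates.append(ligand)
    return sorted(candidates)


# Hypothetical responses of the two cell populations to three ligands.
expressing_cells = {"ligand_1": 5.0, "ligand_2": 1.1, "ligand_3": 2.4}
control_cells = {"ligand_1": 1.0, "ligand_2": 1.0, "ligand_3": 1.0}
candidates = differential_ligands(expressing_cells, control_cells)
```

Because the two populations differ only in expression of the receptor, a ligand that produces a response in one population and not the other is a candidate ligand for that receptor.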

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary
to intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); effecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

WO 01/57190 PCT/US01/04098

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.
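
As an illustration of the sequencing and restriction digestion approaches described above, the sketch below compares a sequenced fragment with a reference sequence and checks whether a single base change removes a restriction site. The two short sequences are invented for the example and are assumed to be already aligned and of equal length; GAATTC is the recognition sequence of EcoRI.

```python
def find_single_base_differences(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i + 1, r, s) for i, (r, s) in enumerate(pairs) if r != s]


def recognition_site_positions(sequence, recognition_site):
    """0-based start offsets of every occurrence of a recognition site."""
    seq = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Invented example sequences; the sample carries a single A-to-G change.
reference = "ATGGAATTCCGTACG"
sample = "ATGGAGTTCCGTACG"

differences = find_single_base_differences(reference, sample)
sites_in_reference = recognition_site_positions(reference, "GAATTC")
sites_in_sample = recognition_site_positions(sample, "GAATTC")
```

In this example the single base change removes the only GAATTC site, so digestion would yield different fragment lengths for the two alleles, which is the basis of the restriction fragment length polymorphism analysis mentioned above.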

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant-induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate-buffered saline (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA, followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium in CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
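
The dosing and scoring timeline described above can be laid out as a simple schedule. The following Python sketch is illustrative only. It assumes that the adjuvant injection defines day 0 and that "every other day" means days 0, 2, 4, and so on through day 24; the function names are hypothetical and are not part of the protocol.

```python
# Illustrative schedule for the adjuvant arthritis protocol described in the text.
# Assumption: day 0 is the day of the Mycobacterium/CFA injection.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]  # scoring days stated in the text


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound (or the PBS control) is administered."""
    return list(range(0, last_day + 1, interval))


def build_schedule():
    """Map each study day to the list of actions performed on that day."""
    schedule = {}
    for day in treatment_days():
        schedule.setdefault(day, []).append("administer test compound or PBS")
    for day in SCORING_DAYS:
        schedule.setdefault(day, []).append("record arthritis score")
    return dict(sorted(schedule.items()))


if __name__ == "__main__":
    for day, actions in build_schedule().items():
        print(f"day {day:2d}: {', '.join(actions)}")
```

Under these assumptions, day 15 is a scoring-only day, while days 14, 18, 20, 22, and 24 combine treatment and scoring.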

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
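
The per-kilogram ranges stated above translate into absolute amounts by simple multiplication with body weight. The Python sketch below shows that arithmetic only. The constant and function names are hypothetical, and the calculation says nothing about which dose within the range is appropriate, which the text leaves to the prescribing physician.

```python
# Dose ranges quoted in the text, expressed in micrograms per kg of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100_000.0)    # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10_000.0)  # about 0.1 ug/kg to 10 mg/kg


def dose_range_ug(weight_kg, per_kg_range):
    """Return (low, high) absolute dose in micrograms for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return (low * weight_kg, high * weight_kg)


if __name__ == "__main__":
    low, high = dose_range_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
    # Convert the upper bound to milligrams for readability.
    print(f"preferred range for 70 kg: {low:.1f} ug to {high / 1000:.0f} mg")
```

For a 70 kg patient, the preferred range works out to roughly 7 µg to 700 mg per dose, which illustrates how wide the stated range is.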
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen.
When a therapeutically effective amount of protein or other active ingredient of the present
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding a resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol,
or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty-acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.
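
The VPD proportions given above can be turned into batch quantities with a short calculation. The Python sketch below is illustrative only. It reads "% w/v" as grams per 100 mL, and it assumes that volumes are additive when VPD is diluted 1:1 with 5% dextrose in water, which is an approximation; the names used are hypothetical.

```python
# VPD composition as stated in the text, in % w/v (grams per 100 mL of solution).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # 5% dextrose in water used as the diluent


def vpd_masses_g(batch_volume_ml):
    """Grams of each component needed before making up to volume with ethanol."""
    if batch_volume_ml <= 0:
        raise ValueError("batch volume must be positive")
    return {name: pct / 100.0 * batch_volume_ml
            for name, pct in VPD_PERCENT_WV.items()}


def diluted_percent_wv(vpd_parts=1.0, diluent_parts=1.0):
    """Approximate % w/v after dilution, assuming additive volumes."""
    total = vpd_parts + diluent_parts
    result = {name: pct * vpd_parts / total
              for name, pct in VPD_PERCENT_WV.items()}
    result["dextrose"] = DEXTROSE_PERCENT_WV * diluent_parts / total
    return result


if __name__ == "__main__":
    print(vpd_masses_g(100.0))   # masses for a 100 mL batch of VPD
    print(diluted_percent_wv())  # approximate composition of VPD:5W
```

Under the additive-volume assumption, the 1:1 dilution halves each concentration, giving about 1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300, and 2.5% dextrose.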

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
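
The weight-percent window for the sequestering agent stated above can be checked with a few lines of arithmetic. The Python sketch below is a minimal illustration under the assumption that weight percent is the agent mass divided by total formulation mass, times 100; the function names and the example masses are hypothetical.

```python
# Sequestering agent ranges from the text, in wt % of total formulation weight.
USEFUL_RANGE_WT = (0.5, 20.0)
PREFERRED_RANGE_WT = (1.0, 10.0)


def agent_mass_range_g(total_formulation_g, wt_range):
    """Return (low, high) grams of sequestering agent for a given total weight."""
    if total_formulation_g <= 0:
        raise ValueError("total formulation weight must be positive")
    low, high = wt_range
    return (low / 100.0 * total_formulation_g, high / 100.0 * total_formulation_g)


def classify_loading(agent_g, total_formulation_g):
    """Classify an agent loading against the ranges stated in the text."""
    wt_percent = 100.0 * agent_g / total_formulation_g
    if PREFERRED_RANGE_WT[0] <= wt_percent <= PREFERRED_RANGE_WT[1]:
        return "preferred"
    if USEFUL_RANGE_WT[0] <= wt_percent <= USEFUL_RANGE_WT[1]:
        return "useful"
    return "outside stated range"


if __name__ == "__main__":
    print(agent_mass_range_g(50.0, PREFERRED_RANGE_WT))
    print(classify_loading(5.0, 100.0))
```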

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline
labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity). Such information can be used to more accurately determine useful doses in
humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
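
By way of illustration only, the therapeutic index described above (the ratio between LD50 and
ED50) can be computed and used to rank candidate compounds as in the following Python sketch.
The compound names and dose values are hypothetical and are not taken from this disclosure; both
doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50 / ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_a": (400.0, 8.0),
    "compound_b": (150.0, 30.0),
}

# Compounds with higher therapeutic indices are preferred, so sort descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)

for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```

A larger ratio indicates a wider margin between the therapeutically effective dose and the lethal
dose in the tested population.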

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
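
As a rough sketch of how the fraction of a dosing interval spent above the MEC might be
estimated, the following Python example assumes a one-compartment model with first-order
elimination after a single bolus dose and ignores accumulation from earlier doses. The model,
the function name and all numerical values are assumptions made for illustration; they are not
part of this disclosure.

```python
import math


def fraction_above_mec(peak_conc: float, mec: float,
                       half_life_h: float, interval_h: float) -> float:
    """Fraction of one dosing interval during which C(t) stays above the MEC.

    Assumes C(t) = peak_conc * exp(-k * t) with k = ln(2) / half_life_h.
    """
    if min(peak_conc, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if peak_conc <= mec:
        return 0.0  # the concentration never exceeds the MEC
    k = math.log(2) / half_life_h
    time_above = math.log(peak_conc / mec) / k  # time at which C(t) falls to the MEC
    return min(time_above, interval_h) / interval_h


# Hypothetical regimen: peak 8 ug/mL, MEC 2 ug/mL, 4 h half-life, dosing every 12 h.
coverage = fraction_above_mec(8.0, 2.0, 4.0, 12.0)
print(f"{coverage:.0%} of the interval is above the MEC")
```

Under these assumptions the concentration falls to the MEC after two half-lives (8 hours), so
about two thirds of a 12 hour interval is covered, which lies within the 50-90% range.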

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 ng/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human protein sequence will
indicate which regions of the protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
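
As a minimal sketch of the kind of hydropathy analysis referred to above, the following Python
example computes a sliding-window profile using the published Kyte Doolittle hydropathy values
(positive values are hydrophobic, negative values are hydrophilic) and reports the most
hydrophilic window. The window length, the function names and the test sequence are assumptions
chosen for illustration, and no Fourier transformation is applied.

```python
# Kyte Doolittle hydropathy values per residue (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple[int, str]:
    """Start index and residues of the window with the lowest mean hydropathy."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window]


# Hypothetical peptide: a hydrophobic stretch followed by a hydrophilic stretch.
print(most_hydrophilic_window("IIIIIKKKKK", window=5))
```

Windows with the lowest mean values are candidate surface-exposed regions; whether such a region
is in fact a useful epitope would still have to be confirmed experimentally.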

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunoabsorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 102:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
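
For illustration, a generic Scatchard linearization can be sketched in Python as follows. The
example fits a straight line to bound/free versus bound, so that the slope estimates the negative
of the association constant (Ka) and the x-intercept estimates the maximal binding (Bmax). It
assumes a single class of non-interacting binding sites; the data are synthetic, the units are
hypothetical, and the sketch is not asserted to reproduce the specific procedure of the cited
reference.

```python
def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Least-squares line through (B, B/F); returns (Ka, Bmax).

    Scatchard relation for one site class: B/F = Ka * (Bmax - B).
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope               # slope of the Scatchard plot is -Ka
    bmax = intercept / ka     # y-intercept is Ka * Bmax
    return ka, bmax


# Synthetic data generated with Kd = 2 (Ka = 0.5) and Bmax = 10, arbitrary units.
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]

ka, bmax = scatchard_fit(bound_conc, free_conc)
print(f"Ka = {ka:.3f}, Kd = {1 / ka:.3f}, Bmax = {bmax:.3f}")
```

With real assay data the points scatter about the line, and curvature in the plot may indicate
more than one class of binding site, in which case a single straight-line fit is not appropriate.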

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: MONOCLONAL
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
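
The figure of ten molecules can be reproduced by simple enumeration. The Python sketch below
assumes that each of two heavy chains (labeled H1 and H2) can pair with either of two light
chains (L1 and L2) to form a half-molecule, and that an antibody is an unordered pair of
half-molecules. The labels and the counting model are assumptions made for illustration only.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each half-molecule (arm) is one heavy chain paired with one light chain: 4 kinds.
arms = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of arms, so mirror images are counted once.
molecules = list(combinations_with_replacement(arms, 2))

# The desired bispecific molecule has each heavy chain with its cognate light chain.
desired_arms = {("H1", "L1"), ("H2", "L2")}
desired = [m for m in molecules if set(m) == desired_arms]

print(len(molecules), "possible molecules,", len(desired), "with the bispecific structure")
```

Four kinds of half-molecule give 4 homodimeric and 6 heterodimeric combinations, i.e. ten in
total, of which a single combination carries both cognate heavy-chain/light-chain pairs.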

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), momordica charantia inhibitor, curcin, crotin, saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-
DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
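
As a minimal sketch of the two storage styles named above (a plain text representation and a database), the following Python fragment records one sequence both ways. The record name `example_seq`, the placeholder bases, the FASTA-style layout and the two-column table are illustrative assumptions, and SQLite merely stands in for the database applications listed above.

```python
import sqlite3
import textwrap

def to_fasta(name, sequence, width=60):
    """Format one record as FASTA text: a '>' header line, then wrapped bases."""
    body = "\n".join(textwrap.wrap(sequence, width))
    return f">{name}\n{body}\n"

def store_in_database(conn, name, sequence):
    """Record the sequence in a simple two-column table (hypothetical layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, bases TEXT)"
    )
    conn.execute("INSERT OR REPLACE INTO sequences VALUES (?, ?)", (name, sequence))
    conn.commit()

def load_from_database(conn, name):
    """Read the bases back, or return None if the record is absent."""
    row = conn.execute(
        "SELECT bases FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

# Placeholder record; not one of the sequences of the invention.
record_name = "example_seq"
record_bases = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

fasta_text = to_fasta(record_name, record_bases, width=20)  # text-file form
connection = sqlite3.connect(":memory:")                    # database form
store_in_database(connection, record_name, record_bases)
```

Which form is preferable depends, as the text notes, on how the stored information will later be accessed: flat text suits sequential reading, while a database suits lookup by identifier.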

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof, or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
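
To illustrate what identifying ORFs within a nucleic acid sequence involves, the sketch below scans all six reading frames for runs that begin with ATG and end at the first in-frame stop codon. It is a plain frame scan written for illustration, not an implementation of BLAST or BLAZE; the function names, the `min_codons` threshold and the example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) for each ATG...stop run.

    Coordinates are 0-based and end-exclusive on the scanned strand;
    the codon count includes the start and stop codons.
    """
    seq = seq.upper()
    found = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        found.append((strand, start, i + 3, strand_seq[start:i + 3]))
                    start = None  # resume looking for the next start codon
    return found

# Placeholder sequence; not one of the sequences of the invention.
example = "CCATGGCCATTTAGGG"
orfs = find_orfs(example)
```

In practice a minimum length far larger than three codons would be chosen, since short ATG-to-stop runs arise frequently by chance.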

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are publicly disclosed, and a variety of commercially
available software for conducting search means exists and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
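
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes independent, equally frequent residues (a simplification that real sequences do not obey) and pairs the estimate with a naive exact-match search; the function names are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random sequence.

    Assumes independent, equally frequent residues: each of the
    (database_length - target_length + 1) positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)

def find_exact_matches(target, stored_sequence):
    """Return the 0-based start positions at which target occurs (overlaps allowed)."""
    hits = []
    pos = stored_sequence.find(target)
    while pos != -1:
        hits.append(pos)
        pos = stored_sequence.find(target, pos + 1)
    return hits

# A six-nucleotide target is expected by chance many times in a megabase,
# whereas a thirty-nucleotide target is expected essentially never.
short_target_hits = expected_random_hits(6, 1_000_000)
long_target_hits = expected_random_hits(30, 1_000_000)
```

For amino acid targets the same function can be called with `alphabet_size=20`, which is why even short peptide targets are already quite specific.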

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
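
As one hedged illustration of a search for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a short stem followed, after a loop, by its reverse complement. The stem and loop lengths are arbitrary assumptions chosen for the example, and no thermodynamic evaluation of the fold is attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_hairpins(seq, stem=4, min_loop=3, max_loop=6):
    """Return (stem_start, loop_length) for each candidate hairpin.

    A candidate is a stem of the given length that is followed, after a
    loop of min_loop..max_loop bases, by the stem's reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop          # start of the would-be right arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == reverse_complement(left):
                hits.append((i, loop))
    return hits

# Placeholder sequence containing one GCGC stem around a TTTT loop.
hairpins = find_hairpins("AAGCGCTTTTGCGCAA")
```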

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
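
The antisense case can be sketched in a few lines: given an mRNA sequence, an antisense DNA oligonucleotide is the reverse complement of the chosen target window. The function name, the example mRNA and the choice of a DNA rather than RNA oligonucleotide are assumptions made for illustration; the 20 to 40 base limit is taken from the text above.

```python
# Maps each mRNA base to the DNA base that pairs with it.
PAIRING = str.maketrans("ACGU", "TGCA")

def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the 20 to 40 base range stated above")
    # Accept input written with T instead of U.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(PAIRING)[::-1]

# Placeholder mRNA; not one of the sequences of the invention.
example_mrna = "AUGGCCAUUGUAAUGGGCCGCUGA"
oligo = antisense_oligo(example_mrna, 0, 20)
```

Selecting which window of the message to target (for example, the region around the start codon) is a separate design decision that this sketch does not address.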

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
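
The difference between a discrete probe and a degenerate pool can be shown by expanding a probe written with IUPAC ambiguity codes into every sequence it stands for. The sketch below is illustrative only; the example probe is a made-up placeholder and the function name is an assumption.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Placeholder probe with two ambiguous positions (Y = C/T, R = A/G).
pool = expand_degenerate("ATGGAYAAR")
```

The pool size is the product of the number of alternatives at each position, so it grows quickly as ambiguous positions are added; a probe without ambiguity codes expands to a single discrete sequence.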

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc
Laboratories have developed a method by which DNA can be covalently bound to the microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
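
The quantities in the linkage method above imply a simple dilution calculation, sketched here for convenience. The helper names are hypothetical. The DNA figure is an upper bound, since it assumes the dispensed solution is still at 7.5 ng/µl and ignores the slight dilution caused by adding 1-MeIm7 before dispensing.

```python
def mass_per_well_ng(concentration_ng_per_ul, volume_ul):
    """Mass of DNA delivered to a well at the given concentration and volume."""
    return concentration_ng_per_ul * volume_ul

def final_molarity(stock_molar, added_ul, existing_ul):
    """Concentration of a stock after it is added to an existing volume."""
    return stock_molar * added_ul / (added_ul + existing_ul)

# 75 µl of ss DNA solution per well, assumed to be at most 7.5 ng/µl.
dna_upper_bound_ng = mass_per_well_ng(7.5, 75)

# 25 µl of 0.2 M EDC added to the 75 µl already in the well.
edc_final_molar = final_molarity(0.2, 25, 75)
```

Under these assumptions each well receives at most about 560 ng of DNA, and the EDC is diluted fourfold to roughly 50 mM in the final 100 µl reaction volume.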

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook
et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
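
By way of illustration only, the site specificities described above can be modeled in silico. The following Python sketch is not part of the cited work; the function names are hypothetical, the digest is assumed to be complete, and the input is assumed to be a linear sequence. It locates PuGCPy sites (normal CviJI) or PuGCPy, PyGCPy and PuGCPu sites (relaxed CviJI**) and cuts between the G and the C.

```python
import re

# Pu = A or G, Py = C or T. Lookaheads let overlapping sites be reported.
# Normal CviJI: PuGCPy. Relaxed CviJI**: PuGCPy, PyGCPy and PuGCPu.
CVIJI = re.compile(r"(?=[AG]GC[CT])")
CVIJI_RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")


def cut_positions(seq, pattern):
    """Return 0-based cut indices; the cut falls between G and C (blunt ends)."""
    # A match starts at the first base of the 4-base site, so G is at +1
    # and C at +2; the fragment boundary therefore lies before index +2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def digest(seq, pattern):
    """Split a linear sequence at every cut position (assumed complete digest)."""
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "TTAGCTAACGCTTTGGCAGG"  # made-up test sequence
    print(digest(example, CVIJI))
    print(digest(example, CVIJI_RELAXED))
```

Because the relaxed site set is closed under reverse complementation (PuGCPy is its own reverse complement, and PyGCPy and PuGCPu are reverse complements of each other), scanning one strand is sufficient in this sketch.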

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
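
The geometry of the 64-patient example can be checked with a short calculation. The Python sketch below is illustrative only and rests on assumptions not stated in the text: the 64 dots of each subarray are taken to form an 8 x 8 grid, each dot is taken to occupy a 1 mm x 1 mm cell, and the function and constant names are hypothetical. Under these assumptions the subarray pitch is 8 mm + 1 mm = 9 mm, and the whole array fits within the 8 x 12 cm membrane.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device, one subarray per pin
PATIENTS = 64                # one 96-well plate per patient
GRID = 8                     # assumed 8 x 8 arrangement of the 64 dots
DOT_MM = 1.0                 # assumed 1 mm x 1 mm cell per dot
GAP_MM = 1.0                 # space between neighboring subarrays
PITCH_MM = GRID * DOT_MM + GAP_MM  # distance between subarray origins


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of the dot printed by one pin for one patient."""
    # Offset printing: each patient's plate is printed with the pin array
    # shifted by a patient-specific offset inside every subarray.
    off_row, off_col = divmod(patient, GRID)
    x = pin_col * PITCH_MM + off_col * DOT_MM
    y = pin_row * PITCH_MM + off_row * DOT_MM
    return x, y


def layout():
    """Map (patient, pin_row, pin_col) to membrane coordinates for all dots."""
    return {
        (p, r, c): dot_position(p, r, c)
        for p in range(PATIENTS)
        for r in range(PIN_ROWS)
        for c in range(PIN_COLS)
    }


if __name__ == "__main__":
    dots = layout()
    width = max(x for x, _ in dots.values()) + DOT_MM
    height = max(y for _, y in dots.values()) + DOT_MM
    print(len(dots), "dots;", width, "mm x", height, "mm")
```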

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.
All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
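
The clustering step can be pictured with a simplified model. The Python sketch below is an illustration under stated assumptions and is not the SBH signature analysis actually used: hybridization is idealized as an exact, noise-free match of each probe to one strand of the clone, the agreement threshold of 0.9 is arbitrary, and all names are hypothetical.

```python
def signature(seq, probes):
    """Idealized binary signature: 1 where the probe occurs in the clone."""
    s = seq.upper()
    return tuple(int(p in s) for p in probes)


def similarity(a, b):
    """Fraction of probes on which two signatures agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(clones, probes, threshold=0.9):
    """Greedy clustering of clones by signature similarity.

    Each clone joins the first cluster whose representative signature agrees
    with it on at least `threshold` of the probes; otherwise it founds a new
    cluster and becomes that cluster's representative.
    """
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep, members in clusters:
            if similarity(sig, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    probes = ["ACGTACG", "TTTTTTT", "GGGCCCA", "CATCATC"]  # made-up 7-mers
    clones = {
        "a": "AAACGTACGAA",
        "b": "CCACGTACGCC",
        "c": "TTTTTTTTGGGCCCA",
    }
    print(cluster(clones, probes))
```

One representative per cluster would then be carried forward for sequencing, which is the purpose the grouping serves in the text.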

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951
and 3949-3954, were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
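
The extension loop described above may be sketched as follows. This Python fragment is illustrative only and is not the algorithm actually used: `search` is a hypothetical callable standing in for a BLASTN query against the databases, and hits are evaluated against individual member sequences rather than against the growing consensus. Only the two thresholds (score greater than 300, identity greater than 95%) are taken from the text.

```python
MIN_SCORE = 300       # BLAST score must exceed this (from the text)
MIN_IDENTITY = 95.0   # percent identity must exceed this (from the text)


def extend_assemblage(seed, search):
    """Grow an assemblage from a seed EST until nothing more qualifies.

    `search(seq_id)` is a hypothetical callable returning an iterable of
    (hit_id, blast_score, percent_identity) tuples for one query sequence.
    """
    members = {seed}
    frontier = [seed]
    while frontier:  # terminates once no new sequence has been added
        current = frontier.pop()
        for hit_id, score, identity in search(current):
            if hit_id in members:
                continue
            if score > MIN_SCORE and identity > MIN_IDENTITY:
                members.add(hit_id)
                frontier.append(hit_id)
    return members


if __name__ == "__main__":
    # Made-up hit table used only to exercise the loop.
    hits = {
        "seed": [("e1", 450, 98.0), ("e2", 280, 99.0)],
        "e1": [("e3", 320, 96.5), ("seed", 450, 98.0)],
        "e3": [("e4", 500, 94.0)],
    }
    print(sorted(extend_assemblage("seed", lambda k: hits.get(k, []))))
```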

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO: 2953-3936 and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
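
The six-frame approach of Method C can be illustrated generically. The Python sketch below is not the Hyseq proprietary program; it assumes the standard genetic code, treats an open reading frame as the longest stop-free stretch in a frame (no start codon is required), and uses hypothetical function names.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Three forward frames followed by three reverse-strand frames."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def longest_orf(seq):
    """Longest stop-free peptide over all six frames (first one on ties)."""
    segments = (seg for frame in six_frames(seq) for seg in frame.split("*"))
    return max(segments, key=len)


if __name__ == "__main__":
    print(six_frames("ATGGCCAAATGA"))
    print(longest_orf("ATGGCCAAATGA"))
```

A variant that requires an initiating methionine would differ only in how each frame is segmented; the text does not say which convention the proprietary program follows.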

5.3 EXAMPLE 3
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.
Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites",
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.
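
For orientation, the two summary scores can be expressed as a short calculation. The Python sketch below does not run SignalP and is not taken from the cited reference; it assumes that per-residue S scores and a predicted cleavage position are already available from the network, and it takes the mean S score to be the average over the residues preceding the cleavage site. The function name and the example values are hypothetical.

```python
def s_score_summary(s_scores, cleavage_index):
    """Return (maximum S score, mean S score) for one polypeptide.

    s_scores: per-residue S scores, N-terminus first (assumed network output).
    cleavage_index: 0-based index of the first residue of the mature protein,
        so the putative signal peptide is s_scores[:cleavage_index].
    """
    if not 0 < cleavage_index <= len(s_scores):
        raise ValueError("cleavage_index must fall inside the sequence")
    signal = s_scores[:cleavage_index]
    return max(s_scores), sum(signal) / len(signal)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.2, 0.1]  # made-up values
    print(s_score_summary(scores, 3))
```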

5.4 EXAMPLE 4
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acid sequences are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766.

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930.

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences encoded by the nucleic
acid sequences are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites", Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942.

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences
encoded by the nucleic acid sequences are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322 (1998), herein incorporated by reference), all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequences within the sequences that code for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
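
For illustration, the two summary values reported for each signal peptide can be computed from per-residue S scores as in the following Python sketch. The sketch does not implement the SignalP neural networks; it assumes that per-residue S scores and a predicted signal peptide length are already available, takes the maximum over all supplied scores, and takes the mean over the predicted signal peptide only. The function name signal_peptide_scores and the example values are hypothetical.

```python
def signal_peptide_scores(s_scores, signal_length):
    """Return (maximum S score, mean S score) for a predicted signal peptide.

    s_scores      -- per-residue S scores, ordered from the N-terminus
    signal_length -- number of residues in the predicted signal peptide
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored region")
    max_s = max(s_scores)                      # assumption: maximum over all scored residues
    region = s_scores[:signal_length]          # residues 1..signal_length
    mean_s = sum(region) / len(region)
    return max_s, mean_s


if __name__ == "__main__":
    # Hypothetical scores for a five-residue toy example.
    scores = [0.9, 0.8, 0.7, 0.2, 0.1]
    max_s, mean_s = signal_peptide_scores(scores, signal_length=3)
    print(f"max S = {max_s:.2f}, mean S = {mean_s:.2f}")
```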

Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.



TABLE 1


Tissue Origin


RNA Source


Library Name


SEQ ID NOS:


lung 






3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 


adult brain 


GIBCO 


AB3001 


1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374 
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 


adult brain 


GIBCO 


ABD003 


3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208- 
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 


adult brain 


Clontech 


ABR001


3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542
583 586 606-607 611 620 645-646 688 
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 


adult brain 


Clontech 


ABR006 


19 32 49 53 60 72 91 103 118 125 130-
131 134 184 224 275 338 350 354 361-
363 374 384 390 394 396 431-432 434-
435 445 468 549 621 732 734-736 745
760-761 764 768-769 775 787 806 811
818 887 903 906 918 930 942 947 957 
973 977 


adult brain 


Clontech 


ABR008 


2-3 9-11 14 17 21 23-25 28-29 31-35 37
41-42 45 47-48 56-57 65-66 69-70 72 75
77-78 88 91-92 97-99 101 103 112-115
118-128 130-131 135 138-140 142 144-
146 148 152 156-157 159-160 163 168
172 174 176 178-180 182-190 194 196-
198 200-201 204 209-214 218 220-225
228-230 232-233 238-240 243-244 246
254-256 260-264 270 272-274 278-279
282-285 289-291 293-294 296-297 301
303-306 312-314 317 321-322 325-328
334 336 338 340-342 344 346 348 350-
352 354 356-358 363 366 369-374 376
379-381 383-386 388-394 398-399 402-
403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452
456-457 460 462 464 471 479 482-483
485 488 490-498 505 507 510 516 519-
522 524 527-532 535 538-539 542-545
548 551 553 555 561-562 566 569 571
574 580-583 588-589 593 597 601-608
611-612 614-615 617-618 621-622 624
630-635 642 644 646-648 650-652 655
657 659-661 664-665 668 672 674 689
693-699 701-702 708 711 715 717 724
728-730 732 734-735 738-740 745 747-
750 753-755 757 761 763-764 766-769
772-773 775 780-781 789-791 793-795
799-800 802-806 809 812 818-819 821-
822 826 829-830 832 834-835 841 843
845 856 858-859 861 864 866 870 872
876 880 883 885 887 893-898 902 906-
916 918 921 925-926 930-931 933 942-
943 946 948 950-951 953-954 958-960
962-965 967 969-970 972 977


adult brain 


Clontech 


ABR011


57 196 270 304 344 436 834 


adult brain 


BioChain 


ABR011


14 82 121-122 168 691 


adult brain 


Invitrogen 


ABR013 


72 108 263 270 336 425 492-494 732 787 
790 826 880


adult brain 


Invitrogen 


ABR014 


293 394 399 764 768-769 928 967 


adult brain 


Invitrogen 


ABR015 


738-739 764 


adult brain 


Invitrogen 


ABR016 


320 374 396 399 405 684 742-743 767 
931 947 967 


adult brain 


Invitrogen 


ABT004 


21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348 
371 374 388 391 394 399 401 409 411 
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621 
624 640 643 647 681 715 723 728 732 
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 


cultured preadipocytes


Stratagene


ADP001


4 28-29 69 93 114 121 132-133 135 151-
152 159 167 172 178 181 184 190 194-
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 
312 314 326-327 361-362 371 374 388 
394 401 403 405 411 420 437 453 466- 
467 470 474 478 496 507-509 517 530 
532-533 584 588 593 602-603 608 610 
617-621 630-631 633 639 642-643 661 
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 


adrenal gland 


Clontech 


ADR002 


1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561- 
562 564 588 597 602-603 606-607 635 
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973 


adult heart 


GIBCO 


AHR001


1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 
41 48 54-57 65 69-72 75 78 80 82-83 97 
99-100 108 112-115 117-121 123-124 
128-133 141 144-146 149 152 159 162- 
163 168 172 176 179 181 184 186-187 
190-191 201 203 208-209 212 216-218 
221 223 227 229 233 244 247 249 253- 
255 258 263-264 267 269-270 274 278 
280-282 285 289 291 295 297-299 301 
303-304 308 313 317 321-322 326 328 
334 344 348 352 358 361-363 370-371 
380 382-383 388 394-396 398 401 403 
405-406 410-416 423 425-427 430-431 
436 452-453 464-465 470-474 481-484 
487-488 490 492-494 496 499-500 505- 
506 508-509 514 523 529-530 533 547- 
548 553 558 563-565 577-578 586-588 
590 593 597 601-603 606-608 610-613 
617-619 621-622 626-628 637-638 642-
644 652 658 661 672 682-683 688 691 
693 697 699 708 711 713 715 732 737 
745 747-748 750-753 759 761 765 768- 
770 775 790 802-803 814-815 818-819 
830 837 839-840 842 845 848 859 861- 
862 867 876-877 887 891-892 896 900- 
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 


adult kidney 


GIBCO 


AKD001


1 3 8 12-14 17 19-25 28-29 33-34 37-39
41 46-48 50 52 55-60 62 65-67 69 71-72
75 77-78 82 84 89-90 93 97 108-110 114-
116 118-121 123-125 128 130-133 135
138 144 146 149 156 159-161 163-164
167-172 176 179 184 186-187 189-190
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 
864 867 870 876-877 887 889 892-894 
896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 


adult kidney 


Invitrogen 


AKT002 


1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489 
495 498 504 506 508-509 520-521 530 
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 699 704 713 732 745 752-753 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 


adult lung 


GIBCO


ALG001


1 3 14 18 28-29 38 54-56 59 92 110 114- 
115 130-131 146 149 156 159 164 167 
176 184 209 217 234-236 240 255-256 
258 263-264 269 271 276 280-281 297 
305 308 312 314 322 325 332 336 344 
353 361-362 388 401 410 420-421 426-
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963 
967


lymph node 


Clontech 


ALN001


3 10 110 146 160 168 196 209 221 269 
278 301 336 348 394 405 411 420 422 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 


young liver 


GIBCO 


ALV001


3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 
547 551 563 581 583 610-611 621 624 
635 643 691 708 711 715 720 752 755 
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911 
949 958 965 969 972-973 


adult liver 


Invitrogen 


ALV002 


3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 
587 594-595 604-605 608 610 621 630- 
631 634-635 637 657 664 690 693 699
723 726 745 751 763 767 784 793 811 
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 


adult liver 


Clontech 


ALV003 


60 134 169-171 275 


adult ovary 


Invitrogen 


AOV001


1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232-
235 240-242 246-247 249 251 254-255
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 
354-358 361-363 365 368 370-372 374 
376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479 
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 

QQl 0O1 QQA 8Q< ffQ'7 QHI Oni On< Q1 1 

913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 

QKV Q^S QAO Q/i'i OA*; QA7 QAQ 071 QTi 
yjl-yjo y\>/.-yOD yOO yO 1 y\)y yllylD 

977 981-982 


adult placenta 


Invitrogen 


APL001


41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 


placenta 


Invitrogen 


APL002 


3 21 31 38 63-64 78 135 143 168 186-187 
Zl2 2Jz 244 zoi 2&U-2ol 334 iio 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 


adult spleen 


GIBCO 


ASP001


1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 

AOQ 71 ^ 70fl nO'X 700 747 7A8 7AQ 77n 
Oyy ilj /ZU /Z3 iZy /^i-iHo lOy-IIK) 

782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 


testis 


GIBCO 


ATS001


6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 

Old. 001 00"^ O'XCi O'sAJl's^ 0^8 Ofx'X 0^0 
ZIt- ZZl ZZJ Zju Z-Jt—Zjj ZJo ZOJ ZOj' 

283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560 
563 574 582 589-590 593 608 616-618
620 623-624 638 642-643 697 699 708 
711 745 747-748 765 767-768 779 784 
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 


Genomic DNA
from BAC
63118


Research
Genetics
(CITB BAC
Library)


BAC001


515 


Genomic DNA
from BAC
39316


Research
Genetics
(CITB BAC
Library)


BAC002


640 


Genomic DNA
from BAC
39316


Research
Genetics
(CITB BAC
Library)


BAC003


640 


adult bladder 


Invitrogen 


BLD001


50 55 66 71 111 143-144 148 160 201 209 
223 255-256 280-281 286 305 315 319 
340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 


bone marrow 


Clontech


BMD001


3 10-13 16 18 20-21 25 28-29 31-34 41 45
48 52 54-55 57 59 61 65 67 72-73 75 78
80 82 84 99 103 108 110 114-115 118-
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 258 260-262 267-269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410-
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761
768-771 774 776-778 784 787 789 813 
817-818 822 834 839-840 842 848 862 
866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 


bone marrow 


Clontech


BMD002 


3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156 
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488-
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 745 752 755 
761 767 lo9-ll\ 115-116 784 787 811 
oiD bio oil o4U 542 849 859 67o 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 


bone marrow 


Clontech 


BMD004 


54 


bone marrow 


Clontech 


BMD007 


766 887 928 


adult colon 


Invitrogen 


CLN001


22 37 67 97 117 121 148-149 168 172 190 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 723 728 751 761 
831 861 887 914-916 934 955 969 984 


Mixture of 16
tissues -
mRNAs*


Various
Vendors*


CTL016 


358 740 760 


Mixture of 16 
tissues - 
mRNAs* 


Various 
Vendors* 


CTL021 


468 527 928 


adult cervix 


BioChain 


CVX001


1 3 10 14 22 28-30 37 41 47-48 51-52 54-
57 71 82 89-90 92 106 108 110-111 117-
118 121 129-131 135 141 143-146 160-
161 164 168 172 177 189-190 193 195
200 204 209 211-212 217 226 229-230
232 234-235 240-242 246 254 260-263
268-270 274 277 282 285 292 295 297
305-308 314-316 319 328 343-344 348
354 358 363 368 380 382-384 389 394
396 399 401 405-407 410 416 418-421
428 430-431 437 442 453-454 459 464
469 471-473 476 480 484 492-495 500
504 506-509 516-517 526 530 532 545
550-551 563-565 569 577-578 585-586
^on /^n52 ^11 ^11 fs> 1 /COO /cm 

jy\) OUo oil 01 J Oiy OZl Dzj O/o OjU- 

631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
119-1 qO 784 788 810-811 813-815 822

Q1A QIC Oin QICi OAQ 0<C1 OC/Z 0£.n 0^71 

oj4 ojo-oi/ ojy o4o ool ooo-oo? 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958- 
959 967 969 973 


diaphragm 


BioChain


DIA002 


3 39 184 203 431 563 848 967 


endothelial 
cells 


Stratagene


EDT001


3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115 
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320- 
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 


Genomic
clones from the
chromosome 8


Genomic
DNA from
Genetic
Research


JirMUUl 


jZ^ did o4y 


esophagus 


BioChain 


ESO002 


97 103 128 371 474 


fetal brain


Clontech


FBR001


0/ 129 156 159 2j2 267 433 446 503 845 
952 


fetal brain 


Clontech 


FBR004 


28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870

887 QCifi O'iS 


fetal brain 


Clontech 


FBR006 


10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622 
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-893 
835 843 845 856 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 


fetal brain 


Clontech 


FBRsOS 


130-131 312 517 637 691 738-739 


fetal brain 


Invitrogen 


FBT002 


3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
fi'id M9-643 ^^47-648 650 679 689 6Q3 

\JJt v)*TZf vtt^J Vj't/ XJt^O vI-JVI \j I J \jO^ v/^J 

699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 834 845 848
856 859 893-894 908-909 913 916 931
933 940 950 967 969 


fetal heart 


Invitrogen 


FHR001


19 57 130-131 394 431 642 769 844 


fetal kidney 


Clontech 


FKD001


3 31 33-34 38 48 54 72 160 208-209 211 
223 264 269 277 283 290 313 325 341 
348 358 396 41 8-490 474 484 506 508- 
509 517 520-521 532 547 553 558 567 
569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 


fetal kidney 


Clontech 


FKD002 


19 474 726 903 


fetal kidney


Invitrogen


FKD007 


3 118 186-187 230 244 271 432 887 969 


fetal lung 


Clontech 


FLG001


69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 


fetal lung 


Invitrogen 


FLG003 


3 8 28-29 32 39 50 66 82 88 92 168 186- 
187 200 204 212 226 229 246 274 309 
327 332 368 374 382 394 398 426-427 
431-432 442 485 536 555-557 587 604- 
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 


fetal lung 


Clontech 


FLG004 


130-131 394 664 769 942 


fetal liver- 
spleen 


Columbia 
University 


FLS001


3 8-10 12-13 16-17 19-25 27-29 33-35 37- 
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217 
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288-
290 292-295 297-299 301-306 308 311- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421 
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521 
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822- 
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871
874-878 887-888 891 893-894 896-900 
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958-
959 961-963 965 967 969-970 972-973 
976-977 981-983


fetal liver- 
spleen 


Columbia 
University 


FLS002 


3 8-13 15-17 19-20 22 25 28-29 33-35 37 
41 45-46 52 54-56 60-61 63-64 66-70 73- 
74 78 80 82 92 99 104-106 108-109 112 
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 

010 Ol/^ OlO OO/C A OQO 

Zlz 214 zlo-zl6 zZj-zz4 ZZo-Z5\j zjz- 
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342- 
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396 
398-399 401 405-406 409-411 413 418-
421 429 431 439-440 442-444 451-452 
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715
717-719 723-727 729 731-734 738-739
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 
865 867 874-878 888 891-892 896-900
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 
947 949-950 953 955-956 958-959 961- 
963 965 968-970 973 977-978 981 


fetal liver- 
spleen 


Columbia 
University 


FLS003 


19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 


fetal liver 


Invitrogen 


FLV001


37 55 60 69 72-73 97 104-105 108 113-
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317
336 342 348-349 358 371 374 382 394 
402 411-412 418-419 428 430 442 453 
517 568-569 580 582 584 587 589 601-
603 606-608 617-618 624 634 639 642- 
644 646 664-665 669 679 715 717 720
726 745 748 751 769-770 782 791 794 
797 824 830-831 845-847 852 859 870
899 913-916 925 928 948 956 958 969 
976 982 


fetal liver 


Clontech 


FLV002 


72 418-419 632 


fetal liver 


Clontech 


FLV004 


3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 


fetal muscle 


Invitrogen 


FMS001


15 27 32 37 67 72 83 99 112 121 138 167 
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 
549 554 558 579-580 583 602-603 608 
639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967






fetal muscle 


Invitrogen 


FMS002 


15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810-
811 874 880 887 903 946 950 958 962-
963 973 


fetal skin


Invitrogen


FSK001


3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 
740 745 748 752 765 768-769 772-773 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 
950 958 962-964 967 975 


fetal skin 


Invitrogen 


FSK002 


3 130-131 146 194 306 354 367 400 405 
474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 


fetal spleen 


BioChain 


FSP001


276 563 842 


umbilical cord 


BioChain 


FUC001


3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 
769 11^-115 793 797 807 818 822 837 
848-849 856 862 868-869 874 885 887 
892-894 903 906-907 916-917 919-920 
928 936 939 944 946-947 962-963 967 
969 


fetal brain


GIBCO


HFB001


3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304-
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726 
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 . 
896-897 900 906-907 9 1 0-9 1 1 9 1 8 921 - 
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 


macrophage 


Invitrogen 


HMP001


86 168 186-187 297 537 608 681 761 845 
877 


infant brain 


Columbia 
University 


IB2002 


2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 
482-483 488 495-496 498 501 503-504 
506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612-
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925 
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 



infant brain 



Columbia
University 



IB2003 



3 12-13 21 27-29 32 39 49 69 72 82 91
113 116 126 128 132-133 142 144 156
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273
276 293-294 312 320 326-327 337 342
346 354-355 358 361-363 382 388 390
394 396 399 402 420 425 431 442 462
474 482 484 488 495-496 510 520-522
524 529 540-541 549 563 582 586 588-
589 596 600-603 606-607 612 617-618
620-621 632 647 650 679 720-722 724
735-736 746 751 754 769 785-786 793
800 807 811-813 818-819 822 824 831
834 838-840 843 856 864 892 896 907
919-920 925 930-931 936 947 950 957
973 982



infant brain 



Columbia
University 



IBM002 



16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 



infant brain


Columbia
University


IBS001


84 86 180 185 198 201 203 230 279 312
326 346 354 366 388 488 542 581 588
620 647 664 732 740 785-786 801 807
822 827 910-911 925 931



lung, fibroblast



Stratagene



LFB001



3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267
269 274 277 282 284 303 308 312 320
334 336 352 372 396 398 412 414 437
453 464 470 481 492-494 508-509 532
539 581 584 617-619 621 628 633 643
688 691 745 752 761 768 794 822 837
848 876 887 953 967 973



lung tumor 



Invitrogen 



LGT002 



1 3 9-10 12-13 20 31 38 41 46 48 51-52 
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203 
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292
294 297 301 308-309 311 314 317 321 
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845
64o 6 J J 63/ ojy aoZ oo4 ooo o/U o/j- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 


lymphocytes 


ATCC 


LPC001


3 9-11 32 47 50 56 71 75 88 97 99 102
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398-
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571
D/y oU4-oUd olU o2U oZo bil 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 


leukocyte 


GIBCO 


LUC001


1 3 9 11 18-19 21 23-25 27 31-34 39 41- 
42 46-48 52 54-58 62-69 71-72 74-75 78-
80 82 89-90 93 99 110 115-121 123-124
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240 
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 

492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602-
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 
875-877 887 891 893-894 896-898 903 
906-911 913-916 921 923 925 927-928
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 


leukocyte 


Clontech 


LUC003 


1 41 82 106 119 123-124 160 177 184 201 
212 221 228 271 279 285 295 321 325 
372 394 411-412 443 468-470 530 532
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789 
809 867 887 923 928 950 


melanoma 
from cell line 
ATCC #CRL 
1424 


Clontech 


MEL004 


3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 
507 524 532 539 560-562 581-582 587 
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767-
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 


mammary 
gland 


Invitrogen 


MMG001


1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412 
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579-
580 582 584 587-589 591 597 601-610
612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650^657 663- 
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738-
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 
821-824 830-833 837 848 856 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923 
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 


induced neuron 
cells 


Stratagene


NTD001


9 65 82 92 106 113 142 146 156 172 176 
191 208 221 258 277 328 333 346 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592 
610 628 642 650 745 74o 752 761 793 
818 848 851 897 


retinoic acid
induced neuron
cells


Stratagene


NTR001


19 87 184 305 385 440 474 626-627 643 
748 799 834 977 


neuronal cells 


Stratagene


NTU001


19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394 
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643
658 732 740 765 769 784 791 793 799
802-803 olo o42 o51 !so4 o97 907 932 


pituitary gland 


Clontech 


PIT004 


3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 


placenta 


Clontech 


PLA003 


138 176 574 896 972 


prostate 


Clontech 


PRT001


3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203 
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503
505-506 523 537 543 564 583 602-603 
611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 


rectum 


Invitrogen 


REC001


19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401 
420 425 442 446 459 483 485 520-521 
532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911 
914-916 934 937-938 942 967 973 982 


salivary gland


Clontech


SAL001


16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377
395 398 400 402 405-406 410 430 442 
459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981


salivary gland


Clontech 


SALs03


217 254 270 388 610 


skin fibroblast 


ATCC 


SFB001


517 949 


skin fibroblast 


ATCC 


SFB002 


269 688 


skin fibroblast 


ATCC 


SFB003 


3 203 897 907 


small intestine 


Clontech 


SIN001


3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984 


skeletal muscle 


Clontech 


SKM001


15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 


skeletal muscle 


Clontech 


SKMs04 


215 


spinal cord 


Clontech 


SPC001


14 20-21 25 28-29 31 39 46 48 59 78 83- 
84 91-92 103 112-113 135 160 168 172 
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471 
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 


adult spleen 


Clontech 


SPLc01


3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 


stomach 


Clontech 


STO001


35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 


thalamus


Clontech 


THA002 


30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 


thymus 


Clontech


THM001


10 16 20 28-29 32 37 41 52 57 66-67 74-
75 110 118 121 129-131 141 151 159-160 
208 211 218 247 269 289 295 297 320 
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500 
508-509 517 524 532 537 551 558 560
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 


thymus 


Clontech 


THMc02 


1 3 9-11 16 21 27 32-34 38-39 51 55-57 
66 72 74 77-78 80 82 89-90 101 112 115 
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554 
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619- 
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720
728 740 746 749 760-762 767 771 775 
794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 


thyroid gland 


Clontech 


THR001


3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478-
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845
848-849 862 864 868-869 871 874 876-
877 887 893-894 896-897 907-909 912
919-921 923 925 928 936 940-942 944
946-947 950 953 955 958-959 962-963
967 969 973 981


trachea 


Clontech 


TRC001


33-34 55-56 69 74 163 172 190 209 212
Zo/ A/V zy/ DKIJ djZ mi 4zo-4z/ 

466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774 
782 816 820 839 848 862 868-869 914- 
915 928 968 


Uterus 


Clontech 


UTR001


4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 

dOCt AO'X 40fi 41 V 4.9'? A'^^ d'^A d'il AAf) 

462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 



TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY


1 


L06175 


Homo sapiens 


occurs in MHC class I region; ORF 


308 


98 


2 


Y70775 


Homo sapiens 


Follistatin-related protein zfsta. 


3094 


98 


3 


X15187 


Homo sapiens 


precursor polypeptide (AA -21 to 
782) 


4112 


100 


4 


AF110640


Homo sapiens 


orphan seven-transmembrane 
receptor 


344 


100 


5 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7879. 


158 


72 


6 


W85607 


Homo sapiens 


Secreted protein clone da228_6. 


1477 


100 


7 


Y30162 


Homo sapiens 


Human dorsal root receptor 4 
hDRR4. 


884 


88 


8 


Y15227 


Homo sapiens 


Leul 


391 


100 


9 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


10 


X92106 


Homo sapiens 


bleomycin hydrolase 


2445 


100 


11 


Y15228 


Homo sapiens 


Leu2 


445 


100 


12 


U27838 


Mus musculus


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


432 


34 


13 


U27838 


Mus musculus 


glycosyl-phosphatidyl-inositol- 
anchored protein homolog 


320 


27 


14 


Y71062 


Homo sapiens 


Human membrane transport protein, 
MTRP-7. 


2323 


99 


15 


U96781 


Homo sapiens 


Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 


5145 


100 


16 


M16653


Homo sapiens 


pancreatic elastase IIB zymogen 


1435 


99 


17 


Y13398 


Homo sapiens 


Amino acid sequence of protein 
PRO346.


1749 


99 


18 


Y02283 


Homo sapiens 


Secreted protein clone br342_11
polypeptide sequence. 


1399 


99 


19 


Y53030 


Homo sapiens 


Human secreted protein clone d24_1
protein sequence SEQ ID NO:66. 


1371 


100 


20 


AL031320 


Homo sapiens 


dJ20N2.5 (novel protein similar to
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase))


2597 


99 


21 


B01384 


Homo sapiens 


Neuron-associated protein. 


1876 


100 


22 


Y68778 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-10. 


2470 


100


23 


Y55935 


Homo sapiens


Human PCH^*? nrn+pin 

X XXAXliaXl X\XAiJ^ LflUL&UI. 


47R1 


00 


24 


Y55935 


Homo sapiens


Hiirnan KT4S^ hrrttprn 

' ' ^'1 1 11*1 1 x<3^ LllWwUla 


9Rfl7 


) C\(\ 


25 


AC024792 


Caenorhabditis
elegans


contain*; ^imilaritv tn TR*O0^n9Q 




•? 1 


26 


Y07972 


787 


Human ^ecreterf nrofpin fraornpnt' 




inn 
luu 


27 


X97630 


Homo sapiens 


serine/threonine protein kinase 


3781 


OS 


28 


AF150755


Mus musculus


microtubule-actin crosslinking factor


3514 


uo 


29 


AF150755


Mus musculus


microtubule-actin crosslinking factor




/V 


30 


Z38011 


Mus musculus 


DMR-N9 


9088 


OD 


31 


AJ000522 


Homo sapiens 


axonemal dynein heavy chain 


6058 


99 


32 


AF037256 


Mus musculus


FS2 nrotein 


9960 


01 


33 


S62140 


Homo sapiens




9017 


lUU 


34 


S62140 


Homo sapiens 


TLS=nuclear RNA-binding protein 


2890 


98 


36 


AB03R9^7 




(5 TVT*r\+t»in— r*rtiiMlo/i T*o^*OT^t/\^ ^"'^T 0 
VJ piULCiIl**uUUDlCU rCOCpiVjl \^JXj£. 


1/0/ 


1 AO 


37 


D79994 




buniiaT lu aiiA.yriii oi v^nruniaiium 




OO 


38 


X63380 


Homo sapiens






CO 


39 


AL022072 


Schizosaccharomyces pombe




lUO / 


0 1 


40 


J03930 


Homo sapiens 


alkaline phosphatase 


2751 


100 


41 


AF139Q68 




PriT-^J. nrnfpin 
piULCJJl 


1 ORR 
lUoo 


OQ 


42 


AL117637


Homo sapiens 


hypothetical protein 


2208 


100




AT 091 IQ'^ 




uiv/^/dz.i ^novci protcm_/ 


1 JZO 


1 OA 


44 


Yfisni 1 

yvuovi 1 


Homo sapiens




iooO 


1 Art 
100 


4S 




Homo sapiens


organic cauon iransponer, juyo 




1 r\(\ 


46 


W78245 




Hf^fTinATif' rtr Hum'STi po/^f*ttf'o/4 t^»*/\+oiTi 
JTlagUlCiil Ul UUUlall bCOlClCU prOLcUl 

pTir*nHpH Viv optip 1 0 




1 OO 


47 


Y41765 




HllTTifltl PROlflR'^ nrr\tpin cpniipn/*p 




lUU 


48 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; CLIC4


1305 


99 


50 


U09413 




^in^* fiTiCTPr T^rrtt"Pifi V M iV 1 
£>UiV.' llligCl piULCUI ^INP 


1 Jul 


J / 


51 


AF061812 


Homo sapiens


Vprntin 1 ^ 




TOO 
lUU 


52 


W63681 


Homo sapiens 


Human secreted protein 1. 


1326 


99 


53 


AB035303 




vaUIiCIll j- i u 




1 OO 

lUU 


54 


A 12022 




lVir\_r-o 


46j 


1 OA 


55 


AL121897 






1 RA7 
160/ 


1 OO 


56 


Y73330 




n 1 XSJ.V1 L'lunc jy i proicui 


R1 R 
oio 


yo 


57 


AF151018 


Homo sapiens


■HfSPn 84 




1 nn 

lUU 


58 


AF125042


Homo sapiens 


bisphosphate 3'-nucleotidase




1 0ft 

lUU 


59 


AF118670


Homo sapiens


orphan G protein-coupled receptor


1071 
xy 1 X 


1 no 

lUv 


60 


X04494 


Homo sapiens


nrppiir^nr'nftlvnpritiHp 


X y\jo 


inn 

lUU - 


61 


AF208865 


Homo sapiens 


EDRF 


528 


100 


62 


D15057 


Homo sapiens 


DAD-1 


567 


100 


63 


AF260665




histone acetyltransferase


1 J lU 


1 OO 


64 


AF260665 


Homo sapiens 


histone acetyltransferase 


1429 


96 


Dj 


A 19771 /I C 


Homo sapiens 


VMifi **AlQfa/l pmnll | 'Linnet D A 13 1 D 

ras-reiatea smaii o i rase kac 1 6 


1073 


100 


oo 




Homo sapiens 


Human secreted protein clone 

dh1073_12 protein sequence SEQ ID

NO: 106. 


AO 

348 


100 


67 


Y82744 


Homo sapiens 


DNA replication and repair 
as<iociflted nrotpin fDl? A SP^ 


1028 


100 


68 


Y44486 


Homo sapiens 


Human GPRW receptor polypeptide. 


1721 


100 


69 


AL031228 


Homo sapiens 


dJ1033B102 (WD40 protein BING4 
(similar to S. cerevisiae YER082C, 
M. sexta MNGIO and C. elegans 
F28D1.1) 


3196 


100


70


A 1276'? 16 

r\.J £. 1 \jj X V7 


Homo sapiens




17SI 


'59 


71 


Y18314 


Homo sapiens


Ti3T5inlpOTn—liVp Tnwfpin 


4146 


00 


72 


AF157028 


Homo sapiens 


protein phosphatase methylesterase-1 


2017 


100 


74 


Y7 1 nR9 

I / l\JO£. 


Homo sapiens


Human B-aggressive lymphoma
(BAL) protein. 




OQ 

yy 


1 J 








714 












717 




77 


AF108420


Takifugu
rubripes


1-aminocyclopropane-carboxylate


733 


56 


78 


G01349 


Homo sapiens 


Human secreted protein, SEQ ID 


650 


99 


79 


AL117635


Homo sapiens 


hypothetical protein 


922 


99 


o 1 


ZJSjyoO 


Homo sapiens 


GJiuojvi i.j t^sunuar to yeast 
suppressor proicin oivrH-u^^ 




11 


R9 
ox. 


API Si.'XAAA 


Homo sapiens


ocuiin-scnsiiivc uiitiaLion lacior za 

Vinncp 


ion 


OO 




1 1 4.1 

VJU J 1 *+ J 




T-Tiimsin cppTPtpH TM"/\tpJTt ^Pr^ 111 
liUIllall aeUlCLCU piUlCUI, o£^V( 

NO* 5724 


4Q<i 
*\yj 


yo 


84 






piri VlTYlSllpilTl 1/lP— CPTlCltlVP TUPtrtl* 

iN'CUijrllllaieilillue dCXldlLl VC laUlUl 


1744 


00 

yy 


85 


Y17791 




VA V? nrntpin 




inn 


87 


AF263538 




giuwLu uiJLiei euiictuuii JLat^iui j 


1044 
iyH*T 


00 
yy 


88 


Y1Q7S7 


Homo sapiens


^FO in NO 47S frnm WO0099741 


1 Ifil 


1 nn 


89 


AF161493 


Homo sapiens 


HSPC144 


1185 


100 


on 


AFl 6140'? 






8^6 


^(\c\ 


91 


B25780 


787 


Human secreted protein SEQ ID 


647 


41 




T T^7'2/l/l 


Mus musculus 


Meis3 


1 nrtT 
lUU/ 


89 


01 


API ^OQ^/! 


Homo sapiens 


cardiotrophin-like cytokine CLC




9S 


94 


AL390114


Leishmania 
major 


extremely cysteine/valine rich 
protein 


223 


29 


yD 




Arabidopsis
thaliana


contains similarity to adenylate
kinase~gene_id:MCA23.18


ZO / 


35 






Homo sapiens


1797 1 69 1 


J iSjJ 






070007 


Homo sapiens


Human nucleic acid-binding protein,
NuABP-l. 




oo 
yy 


98 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


507 


70 


00 

yy 


At 1 / 


Homo sapiens 


Traf2 and NCK interacting kinase,
splice variant 1


094z 


99 


100 


L11239


Homo sapiens 


homeobox protein 


717 


100 


1 m 




Homo sapiens


similar to zinc finger proteins;
similar to AAC01956

^^JrJUL/.gZtt^j V 1 V) 


zl54 


no 

98 


102 


AC003682 


Homo sapiens 


R28830 2 


1287 


48 




Atroni CIO 


Rattus
norvegicus


dynamin ILlbb isoform 




etc 




I lyD 1 V 


Homo sapiens


Human carbohydrate-associated


1 1Q/i 




105 


Y79510 


Homo sapiens 


Human carbohydrate-associated 


1209 


90 


106 


AL096748 


Homo sapiens 


hypothetical protein 


1216 


100 


ins 

1 uo 




Homo sapiens 


metallothionein 2




lUU 


109 


AL034422 


Homo sapiens 


dJ1141E15.2 (novel protein)


433 


100 


1 in 

1 JU 


A Ul Ol IOC 


Homo sapiens 


anaphase-promoting complex subunit 

A 


oo3 


1 Art 

100 


111 


AL021712 


Arabidopsis
thaliana


putative protein


1 RS 


7A 


112 


AF250138 


Homo sapiens 


small stress protein-like protein 
HSP22 


1063 


100 


113 


AL109976 


Homo sapiens


dJ794I6.1.1 (novel protein)


4176 


99 


114 


Y36151 


787 


Human secreted protein 


668 


100 



WO 01/57190



PCT/US01/04098





115 


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


116 


AF210317 


Homo sapiens 


facilitative glucose transporter family 
member GLUT9 


2052 


99 


117 


Y73328 


Homo sapiens 


HTRM clone 082843 protein 
sequence. 


931 


100 


118 


X04085 


Homo sapiens 


catalase 


2846 


100 


119 


AF147717 


Homo sapiens 


ubiquitin C-terminal hydrolase 
UCH37 


1695 


100 


120 


X73882 


Homo sapiens 


microtubule associated protein 


3801 


go 


121 


AC004882 


Homo sapiens 


similar to CAA16821 
(PID:g3255952) 


3223 


100 


122 


M93311 


Homo sapiens 


metallothionein-III


421 


100 


123 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7908. 


557 


Q4 


124 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7908. 


222 


SI 


125 


AF232009 


Homo sapiens 


peroxisomal trans 2-enoyl CoA 
reductase 


1565 


QQ 

yif 


126 


AB004906 


Ipomoea 
purpurea


transposase 


146 


20 


127 


M60165 


Homo sapiens 


guanine nucleotide-binding 
regulatory protein 2 


1832




128 


Y10319


Homo sapiens 


carnitine carrier 


1592 


100 


129 


U75467 


Drosophila 
melanogaster 


Atu 


937 


J\J 


130 


Z21507 


Homo sapiens 


human elongation factor-1-delta


494 


R7 
o / 


131 


Z21507 


Homo sapiens 


human elongation factor-1-delta


938 




132 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


6745 


100 


133 


Y58633 


Homo sapiens 


Protein regulating gene expression 
PRGE-26. 


4818 


95 


134 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


135 


U72970 


Sus scrofa 


calcium/calmodulin-dependent
protein kinase II isoform gamma-B


2723


yy 


136 


G03213 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 7294. 


450 


100


137 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A 
member 24 


627 


99 


138 


AF155648 


Homo sapiens 


putative zinc finger protein 


5855 


92 


139 


AF144638 


Homo sapiens 


sphingosine-1-phosphate lyase


2977 


100 


140 


AF152318 


Homo sapiens 


protocadherin gamma A1


4778 


100 


141 


B08517 


Homo sapiens 


Amino acid sequence of a beta- 
tubulin antigen. 


5841 


100 


142 


X56667 


Homo sapiens 


calretinin 


1410 


99 


143 


X92763 


Homo sapiens 


tafazzins 


1605 


100 


144 


Y95293 



Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


145 


AP226046 


Homo sapiens 


GK003 


1198 


100 


146 


M22877 


Homo sapiens 


cytochrome c 


554 


98 


147 


AJ272212 


Homo sapiens 


protein serine kinase 


2196 


100 


148 


AB026491 


Homo sapiens 


PICK1


2114 


9S 


149 


AB018580 


Homo sapiens 


hluPGFS 


1699 


100 


150 


X91868 


Homo sapiens 


six1


1509 


100 


151 


AF266505 


Mus musculus 


pseudouridine synthase 3


2135 


84 


152 


U29170 


Drosophila 
melanogaster 


ANON-23D 


883 


43 


153 


G04075 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8156. 


567 


99 


154 


AY009128 


Homo sapiens 


ISCU2 


138 


100 





194 


B25679 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 15 SEQ ID NO:68.


760 


100 


195 


AB020315 


787 


homologue of mouse dkk-1 gene:Acc 


1466 


100 


196 


U35730 


Mus musculus 


jerky


2021 


75 


197 


AL136450 


Homo sapiens 


dJ510O21.1 (novel protein)


632 


100


198 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


512 


24 


199 


Y70775


Homo sapiens


Follistatin-related protein zfsta.


2027 


63 


200 


X87237 


Homo sapiens 


a-glucosidase I 


4447 


99 


201 


AF101078 


Caenorhabditis 


CLU-1 


1393 


46 


202 


X04571 


Homo sapiens 


precursor polypeptide (AA -22 to 
1185) 


6611 


100 


203 


X00474 


Homo sapiens 


pS2 precursor 


466 


100 


204 


AB029333 


Halocynthia
roretzi 


HrPET-1 


974 


54 


205 


AF146019 


Homo sapiens 


hepatocellular carcinoma antigen 
gene 520 


998 


100 


206 


AF071002 


Homo sapiens 


minK-related peptide 1; MiRP1


632 


100 


207 


AB038162 


Homo sapiens 


trefoil factor 2 


744 


100 


208 


U30521 


Homo sapiens 


P311 HUM 


363 


100 


209 


AB000911 


Sus scrofa 


ribosomal protein 


782 


100 


210 


AB021227


Homo sapiens


membrane-type-5 matrix
metalloproteinase 


3545 


100 


211 


AF180920


Homo sapiens


cyclin L ania-6a


2722 


100 


212 


AF105365 


Homo sapiens 


K-Cl cotransporter KCC4 


5624 


100 


213 


U29244 


Caenorhabditis 
elegans 


similar to human (TRE) transforming 
protein (PIR:S22157)


602 


32 


214 


AL033538 


Homo sapiens 


dJ477H23.1 (novel protein) 


3195 


100 


215 


X52011 


Homo sapiens 


muscle determination factor 


1262 


100 


216 


AF083248 


Homo sapiens 


ribosomal protein L26 homolog 


739 


100 


217 


AF006751 


Homo sapiens 


ES/130 


4793 


99 


218 


AB007859 


Homo sapiens 


KIAA0399 protein


3559 


99 


219 


AK026291 


Homo sapiens 


unnamed protein product 


826 


100 


221 


Y84045 


Homo sapiens 


Splice variant of cancer associated 
polypeptide CH1-9a11-2.


5851 


97 


222 


Z67996 


Homo sapiens 


tenascin-R (restrictin) 


7186 


100 


223 


AF134802 


Homo sapiens 


cofilin isoform 1 


846 


100 


224 


Y17711 


Homo sapiens


atopy related autoantigen CALC


1611 


99 


225 


AF190051


Gallus gallus


hepatocyte nuclear factor 1 a 
dimerization cofactor isoform 


443 


81 


226 


AK026256 


Homo sapiens 


unnamed protein product 


866 


98 


227 


Z69368 


Schizosacchar 
omyces pombe 


nuf2-like coiled-coil protein


230 


25 


228 


AF275948 


Homo sapiens 


ABCA1


11763 


99 


229 


AF161384 


Homo sapiens 


HSPC266 


2006 


98 


230 


Y16270 


Homo sapiens 


paralemmin


1951 


100 


231 


AJ245599 


Homo sapiens 


putative secreted ligand 


2379 


99 


232


W88499 


Homo sapiens 


Human stomach carcinoma clone 
HP10412-encoded protein. 


1545 


99 


233 


AF096286 


Mus musculus 


pecanex 1 


3623 


93 


234 


V64619 cd 

1 ~ 


Homo sapiens 


30-NOV-1990 Human HEl cDNA 


796 


100 


235 


V64619 cd 
1 


Homo sapiens 


30-NOV-1990 Human HEl cDNA. 


470 


98 


236 


AF227258 


Bos taurus


RPGR-interacting protein-1


1262 


38 


237 


AJ132445 


Homo sapiens 


claudin-14 


1181 


100 


238 


AL034562 


Homo sapiens 


dJ684024.2 (prodynorphin (Beta- 


1330 


100 





155 


AF141315 


Homo sapiens 


alpha-1,4-N-
acetylglucosaminyltransferase


1842 


100 


156


AF110645


Homo sapiens


candidate tumor suppressor p33
ING1 homolog


1294 


99 


157 


AF159297 


Zea mays 


extensin-like protein 


238 


25 


158 


AL133325 


Homo sapiens 


dJ984P4.3 (Homeobox protein 
NKX2B) 


1437 


100 


159 


AF073298 


Homo sapiens 


small EDRK-rich factor 2 


294 


100


160 


AC004858 


Homo sapiens 


U1 small ribonucleoprotein 1SNRP
homolog; match to PID:g4050087 


4032 


100 


161 


AB012109 


Homo sapiens 


APC10


990 


100 


162 


AL162751 


Arabidopsis 
thaliana 


putative protein 


194 


32 


163 


AJ005698 


Homo sapiens 


poly(A)-specific ribonuclease 


3351 


100 



164 


AF117646


Homo sapiens 


long CBL-3 protein 


2547 


99 


165 


AC004002 


Homo sapiens 


similar to ciliary dynein beta heavy
chain; 78% similarity to P23098
(PID:g118965)


5065 


100 


166 


M10942 


Homo sapiens 


human metallothionein-Ie 


381 


100 


167 


AF126484 


Homo sapiens 


CARD4 


4961 


100 


168 


AF161518 


Homo sapiens


HSPC169 


1604 


100 


169 


M64983 


Homo sapiens 


fibrinogen beta chain 


2482 


100 


170 


M64983 


Homo sapiens 


fibrinogen beta chain 


2679 


100 


171 


M58514 


Gallus gallus 


fibrinogen beta chain 


1059 


78 


172 


AF078845


Homo sapiens 


16.7Kd protein 


786 


100 


173


AC004774


Homo sapiens 


Dlx-6 


923 


100 


174 


Z98974 


Schizosaccharomyces pombe


putative vacuolar protein sorting- 
associated protein 


185 


31 


175 


X56203 


Plasmodium 
falciparum 


liver stage antigen 


283 




176 


W74726 


Homo sapiens 


Human secreted protein fg949 3. 


1879 


100 


177 


AJ222967 


Homo sapiens 


cystinosin 


1920 


100 


178 


AC024796 


Caenorhabditis 
elegans


contains similarity to TR:O76167


221 


27 


179 


Y66632 


Homo sapiens 


Membrane-bound protein PRO276.


1370 


100 


180 


AF151803 


Homo sapiens 


CGI-45 protein


215 


28 


181 


G02694 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 6775.


283 


100


182 


Y17292 


Homo sapiens 


Human cell death preventing kinase 
(DPK-1) protein sequence.


2676 


100


183 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing
regulatory protein SRRP86 


148 


27 


184 


AF151855 


Homo sapiens 


CGI-97 protein 


1214 


96 


185 


AF289664 


Mus musculus 


CYLN2 


4673 


90 


186 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and 
GENEWISE) 






187 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE) 


2332


100 


188 


X83543 


Homo sapiens 


APXL 


8513 


99 


189 


AF059569 


Homo sapiens 


actin binding protein MAYVEN 


3106 


99 


190 


M18135 


Rattus 
norvegicus 


smooth-muscle alpha tropomyosin




If J 


191 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


192 


D30689 


Bacillus 
subtilis 


subunit of nitrite reductase 


113 


29 


193 


Y44984 


Homo sapiens 


Human epidermal protein-1.


538 


97 





















Neoendorphin-Dynorphin precursor.












Proenkephalin B precursor))








ArzozUz / 


— : 

Homo sapiens 


aTT ^ a 9 


CAQ 


lOU 




AT A'70'2/1/1 


Arabidopsis 


putative protein


J 74 


ii 






thaliana










Pi\A)\jZD^H 


Homo sapiens


Gene product with similarity to


1 j4^ 


J 1 








dynein beta subunit








t\JZ 1 1 1 


1 oKlIUgU 


nvnJNJS^ pruicin 










ruuripcs 










A T ri9 1 0 1 Q 


Homo sapiens 


i — i T^- 

b34I8.1 (Kruppel related Zinc Finger


1 Aid 
14/0 


45 








protein 184) 






244 


AF190167 


Homo sapiens 


membrane associated protein SLP-2 


1736 


99 


245 


Y10601


Homo sapiens 


ankyrin-like protein 


5877 


100 


246 


AL121771


Homo sapiens 


dJ548G19.1.1 (novel protein


3628 


100 








(ortholog of mouse zinc finger 












protein ZFP64) (translation of cDNA
























^isoioim V)) 






247 


L25314


Drosophila


actin-related protein 


984 



47 






melanogaster 








9/1 C 


AO J /4D 


Homo sapiens 


TsXJcXi receptor 


1 AQ^ 


1 AA 




At 1 IZZUiS 


Homo sapiens 


13kDa differentiation-associated


olO 


1 AA 








protein 








ADAAI 7n9 

/xrVUl f\} 1 


— — ; 

Homo sapiens 


human gene for claudin-8. Accession 


I 1 99 

II IZ 


1 AA 








A 19 1 






9^1 
ZD 1 


AT 1 '^/^19^ 


"tj = 

Homo sapiens 


HT'3n4Tn4 1 i-»r'/\+Ain\ 

uj jutoi't. 1 ^^novci proiein^ 


77 (! 
/ /o 


inn 


9^9 


AT A'^ 1 1 8/? 


Homo sapiens 


dpo'o4vji.i ^supponea oy runiNiio^ 




1 AA 


9^'5 
ZD J 


V17^7 1 
I 1 / Jj 1 


Homo sapiens 


ji urn oil sccreieu proiein cione dLiZUD 




1 AA 








14 T^rrtt^iti 
IH |JJ ULClll. 






9Sd 


AT 04084'^ 


Homo sapiens


^1*^09^1 7 'K An'^40 rtrntpin^ 

ujjyzivii /.J \^rwiA/v\/j^y proicin_^ 


D 1 


00 

yy 




A T949Q79 


Homo sapiens


TOT T TP r^rfttAtn 


1494 
If Z*l 


00 

yy 


9S/^ 




Homo sapiens


riumou proiein cione rxr vzo^^. 


lo /o 


inn 

lOU 


7S7 

Z> J / 




T-Tr\TYi i*\ com ATI c 


KliICalU~ill\.C pruivill UrVTkJ.!^ 


900'^ 
zy\jj 


inn 


9^R 


AT 0944051 




HT417\yll4 1 TrirtvAl rirntAin^ 

ujH- 1 /ivii*T. i ^QUvei pruLcmj 


^RO 

Doy 


1 nn 


9^0 


JSSjOZ 1 0 


Homo sapiens


1 neropcuiic pojypcpuae ixom 


R'^ft 
oJU 


1 nn 








^n 1 4^^/\n^^ ^Al 1 1 m A 

gUODloSlUma ceii lUlc. 






9/!;ft 
zOv 


AT71 m 7C4 


_ : 

Homo sapiens 


D-iKL/jr vanani iiJxvo-iKappats 


•299^ 
DZZKi 


CO 




Ar lUl /o4 


Homo sapiens 


D- 1 KUr variant njivo-iKappaiD 


9091 

zozl 


1 AA 

lUU 


9A9 


AT71 m 7G/I 

ArlUl/54 


Homo sapiens


D- 1 KUr variant xiJKo-iKappajD 


J 14y 


QO 

yy 


9*^*3 
ZOJ 


AT71Q9AAn 

Ar ly /UOU 


Homo sapiens 


src homology 3 domain-containing 


99^9 
ZZD 1 


1 AA 








protein xiur--) J 






9Ad 


I oOZOZ 


Homo sapiens 


Human secreicG protein riAv^/VKZj, 


/DO 


1 AA 








9Fn TT) "MO- 177 










T-l rtm /*\ ctmiAnc 
XXUlllU oapiClla 


T^iimnn 9T^P^APT nnlunpntiHp 
nuiuaii oxjf o/vr i-r puiypcpiiuc. 


9770 
z 1 ly 


inn 

i\f\j 


266 


Y56966 


T-Tnm ■ ca r»i PTi c 
x^jLUiUU oa^xcxio 


T^iiTnan 9TiP^APT nnlvnpntidp 
fjLUiiiciii oijxo/vri^ pcjypepiiue. 


1 V/ 10 


yy 


967 


A 1*^0046^ 


rfr\mr\ cnniAnc 
n.uuiU oapicio 


mituhvp wnitf* fiSTTiili/ ATTP— KiriHino 
uuiaLive wiuic Loiiiiiy -uuiuiiig 


1 SS7 


0^ 

yj 
















Apnn4n'?n 


xiuuiV/ oa|/iviio 


T?718Sfi 9 

I. ^ XOJ\J 




00 

yy 






Homo sapiens


T-TT 9^ i*iV\ncnTnol n»*rt+Am 

txX-tZD rjuusuinoi proiein 


714 
/ IH 


1 nn 

lUU 


97n 


AQn^-3Q91 


Mus musculus


iNUTi reiaLcu proicjn inut^ 


1 RS^ 
i 0 J J 


04 
yn 


971 


AFORlRRfi 


TTr^TTi cimiAnc 
nuilii/ CMipieilo 


PROl-liVp nrn+pin 


l"vJ 


00 

yy 


111 


AFl^^(il09 
/ur x\j\}^yz 


H^fMT^rt dmiATlC 




1 UUU 


inn 




AT 0999*3(2 


Homo sapiens 


ajiU4Zjviu.4 ^novei proieinj 


99m 


1 AA 

iUu 




WRR/^fi7 




Qppt'pI'pH nrnt'Pin An/>r\HpH V^v D'Atip 

oeucicu piuLeui cKiuuucu ujr gcuc 




00 

yy 








134 clone HAffiP89. 






275 


X00129 


Homo sapiens 


precursor RBP 


1044 


97 


276 


Z47500_cd1


Homo sapiens 


11-MAY-1998 Human RHOH gene


1161 


100 








sequence. 






277 


AB049188 


Equus caballus 


ubiquitin C-terminal hydrolase 


1118 


96 





278 


AF270647 


Homo sapiens 


GTTl 


1564 


100 


279 


AF143956 


Mus musculus 


coronin-2 


2414 


94 


280 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


911 


92 


281 


R85151 


Homo sapiens 


Endothelial cell polypeptide. 


1031 


100 


282 


D83948 


Rattus 
norvegicus


S1-1 protein


3975 


on 


283 


Y14768 


Homo sapiens


I kappa B-like protein


2037 


mo 


286 


AL031316 


Homo sapiens 


dJ28O10.3 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1))


294 


100 


287 


D64109 


Homo sapiens


tob family


1773 




288 


AB026043 


Homo sapiens 


MS4A7 


1230 


100 


289 






lfriiPr»npl-rpl5itpH T^MA-hinHintJ 

protein 






290 


AJ001810 




mRNA cleavage factor I 25 kDa


1717 


inn 


291 


Y99454 


Homo sapiens


Human PRO1605 /TrN078fi"li amino 
acid sequence SEQ ID NO:395. 


694 


inn 


292 


Y44824 




Human molecule associated with cell

proliferation, MACP-4. 


2370 


inn 


293 


AJ276101 


Homo sapiens


GPRC5B protein


7099 


100


294 


AF161406 


Homo sapiens


HSPC288 


719 


100 




2' JOO^O 




Protein regulating gene expression
PRfiF-91 


IZ /O 


1 AO 


296 


U91561 


norvegicus 


TwriHriYinp ^'-■nViAcnViatp nviHncp 




0 1 


297 


L02956 


Xenopus

laevis 


n hnn n f 1 pr\nmtpi n 

1 iUUllUVfievUI ULCili 


1 UZ.*T 


OJ 


298 


AF226730 




Cyt19


1 779 


00 

yy 


299 




Homo sapiens 


Cyt19


906 


98 


300 


Y54324 




gastric cancer antigen protein. 


/ i o 




301 


AF125533




i<if>fnrm 


1 ^n^ 

1 uuu 


inn 


302 


Y32206 


Homo *;anieTi*? 


Human rerpntormnlpnile rRTT^^ 

encoded by Incyte clone 2825826. 


1676 




303 


AF247565 


Homo sapiens


ring finger protein


"57 S 


inn 


304 


AF208844 


Homo sapiens 


BM-002 


428 


100 


305 


AC004983 


Homo sapiens 


similar to PID:g3877944


1988 


100 


306 


AL132978


Arabidopsis
thaliana 


putative protein


210 


7S 


307 


Y10530 


Homo sapiens 


olfactory receptor 


1645 


100 


308 


AF180681


Homo sapiens


guanine nucleotide exchange factor


3597 


100


309 


AF111856 


Homo sapiens


sodium dependent phosphate
transporter isoform NaPi-3b


3591 


9Q 


310 


Y13583 


Homo sapiens 


G-protein coupled receptor 


2171 


100 


311 


Z73420 


Homo sapiens 


cE146D10.2 (mercaptopyruvate
sulfurtransferase (EC 2.8.1.2))


1598 


100 


312 


X79535 


Homo sapiens 


beta tubulin 


2348 


100 


313 


AF070658 


Homo sapiens 


HSPC002 


861 


100 


314 


AF078866 


Homo sapiens


SURF-4 


1395 


100 


317 


Z37986




phenylalkylamine binding protein




100


320 


AB047892 


Macaca 
fascicularis


hypothetical protein 


258 


82 


321 


Y25755 


Homo sapiens 


Human secreted protein encoded 
from gene 45. 


1440 


100 


322 


AB016531 


Homo sapiens 


PEX16 


1741 


100 


323 


AL391141 


Arabidopsis 


putative protein 


274 


49 









thaliana 








325 


AF140501 


Homo sapiens 


DNA polymerase iota 


3691 


99 


326 


X96698 


Homo sapiens 


D1075-like 


1450 


96 


327 


AF152325


Homo sapiens 


protocadherin gamma A5 


4769 


100


328 


AF151803 


Homo sapiens 


CGI-45 protein 


1970 


100 


329 


X74070 


Homo sapiens 


transcription factor BTF3


639 


O I 


330 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


331 


W54040 


Homo sapiens


HIFI. 


A9LA 


OQ 

yo 


332 


AF024617


jnLV/JLUU duUlCllO 


protein 


oy 1 




J J J 




Rattus
norvegicus


xVaUllw 


9100 


OA 


334 


G03877 




Human secreted protein, SEQ ID
NO: 7958.




1 on 


335 


AL008582 


Homo sapiens 


bK223H9.2 (ortholog of A. thaliana 
F23F1.8) 


626 


100


336 


AF110774


JLJLWXXXVr JuLf^WXlO 


adrenal gland protein AD-001




inn 


337 


AB011414 


Homo sapiens


Kruppel-type zinc finger protein






338 


AF207600 


Homo sapiens 


ethanolamine kinase 


129 


100 


340 




thaliana


UUUlLl VC 

phosphoribosylformylglycinamidine 




^o 


341 


Y28576 


Homo sapiens 


Secreted peptide clone p)e503 1. 


944 


100 


342 


U32274 


Saccharomyces cerevisiae


Ydr386wp; CAI: 0.12 


191 


37 




An 1771 


synthetic
construct 



vascular anticoagulating protein 


lool 


yy 






Homo sapiens


uncharacterized hematopoietic
MDS032 




1 An 


345 


Y70400 


XXWIXIW 3CJ.L/iV'Xl<3 


T-TiiniflTi f*pll-Qionalliri(T r^mtpin-O 




inn 


346 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 

VP 16 1 HpWvpH Tvrotpin 


962 


100 


347 


AF183428


Homo sapiens


2R 4 kDa nrotein 

X>U.^ AJ^a LIXUICXXX 


1 '^9Q 
1 j^y 


1 nn 


348 


AC006069 


Arabidopsis
thaliana


putative cleavage and 
polyadenylation specificity factor


1383 


55 


349 


AL032631 


Caenorhabditis
elegans


Y106G6H.8 


194 


39 


350 


T 770669 




Pao*'llgaxxU cuaUwialCU lawLui D 




ZJ 


351 


Y93468 




Amino acid sequence of a potassium

channel interactor protein. 


1 1 R"? 


00 

yz 


352 


AF005856 


yakuba 


tinrtn'? A ^ 
Oxi\JilX.r\,j 


111 
111 




353 


AJ271684 


Homo sapiens


myeloid DAP12-associating lectin




inn 


354 


AF099100 


Homo sapiens 


WD-repeat protein 6 


2882 


99 


355 


U51730 


Murine
leukemia virus


ivvciac UaXXoUlxpiuov 


J lO 


40 

'tZ 


356 


D50617 


Saccharomyces cerevisiae


YFL042C 


97Q 
^ ly 


97 


357 


D50617 


Saccharomyce 


YFL042C 


279 


27 


358 


AF161432 


Homo sapiens 


HSPC314 


1059 


93 


359 


AB029488 


Homo sapiens 


C11orf21


758 


99 


360 


AJ251024 


Homo sapiens 


putative odorant binding protein ag 


1239 


100 


361 


U43281 


Saccharomyces cerevisiae


Lpg22p 


2074 


74 


362 


U43281 


Saccharomyces cerevisiae


Lpg22p 


2153 


74 




363 


AC007153 


Arabidopsis 


100632 


156 


24 






thaliana 








364 


AF197927 


Homo sapiens 


AF5q31 protein 


3992 


99 


365 


D28500 


Homo sapiens 


mitochondrial isoleucine tRNA 


4286 


98 








synthetase 






366 


X97868 


Homo sapiens 


arylsulphatase 


3141 


98 


367 


AL162048


Homo sapiens 


hypothetical protein 


1532 


100 


368 


L36062 


Mus musculus 


steroidogenic acute regulatory 


189 


25 








protein






369 


AF113249


Homo sapiens 


multiple domain putative nuclear 


1022 


59 








protein 






370 


M15888 


Bos taurus


endozepine-related protein precursor 


2425 


84 


371 


X66363 


Homo sapiens 


serine/threonine protein kinase 


2562 


100 


372 


W74802 


Homo sapiens 


Human secreted protein encoded by 


1532 


89 








gene 73 clone HSQEL25. 






373 


AF100772


Homo sapiens 


tenascin-Ml 


11535 


99 


374 


AF090934


Homo sapiens 


PRO0518 


382 


100 


375 


AB021643 


Homo sapiens 


gonadotropin inducible transcription 


2761 


99








repressor-3 






376 


AB049758 


Homo sapiens 


MAWD binding protein


1331 


100


377 


AF070666 


Homo sapiens 


Kruppel-associated box protein 


466 


97 


378 


S59342 


Mus sp. 


nuclear pore complex glycoprotein 


464 


60 








p62 






379 


AF149205


Mus musculus 


Su(var)3-9 homolog Suv39h2


1690 


88 


380 


AF227906 


Homo sapiens 


UDP-glucose:glycoprotein 


7851 


99 








glucosyltransferase 2 precursor






381 


AF118566


Mus musculus 


hematopoietic zinc finger protein 


1769 


92 


382 


AK000619 




unnamed protein product


Sin 


100


383 


AF227906 


Homo sapiens


UDP-glucose:glycoprotein


7851 


99








glucosyltransferase 2 precursor






384


AF117946


Homo sapiens


Link guanine nucleotide exchange


2363 


100








factor II 






385 


AF125390 


Drosophila 


L82G 


139 


41 






melanogaster 








386 


Y94907 


Homo sapiens 


Human secreted protein clone




sn 








ca106 IQx nrntpin ^ipniipncp ^PO TD 












NO:20. 






387 


U18795


Saccharomyce


Yel064cp




9R 
Zo 






s cerevisiae 








388 


AF177388


Homo sapiens 


cancer-amplified transcriptional


10748 


00 








coactivator






389 


AJ002744 


Homo sapiens 


UDP-GalNAc:polypeptide N-


3469 


OA 








acetylgalactosaminyltransferase 7






390 


AF097366 


Homo sapiens 


cone sodium-calcium potassium


3166 


100 








exchanger 






391 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 


5337 


60 








molecule 






392 


U81035 


Rattus 


ankyrin binding cell adhesion 


3967 


91 






norvegicus 


molecule neurofascin 






393 


X65224 


Gallus gallus 


neurofascin 


4097 


78 


394 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 


4292 


99 








-19 to 4525) 






395 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


396 


AB017026 


Mus musculus 


oxysterol-binding protein 


2173 


98 


397 


AL035587 


Homo sapiens 


dJ475N16.4 (KIAA0240)


2393 


100 


398 


W74813 


Homo sapiens 


Human secreted protein encoded by 


722 


92 








gene 85 clone HSDFV29. 






399 


Y71110 


Homo sapiens 


Human Hydrolase protein-8 


1637 


99 








(HYDRL-8). 









400




elegans 


contains similarity to lupus LA
protein homologs 


325 


43 


401




Methanothermobacter
thermoautotrophicus


conserved protein


231 


36 


402 


Y27795 


Homo sapiens 


Human secreted protein encoded by 
gene No. 79. 


1539 


99 


403 


Z50853 


Homo sapiens 


CLPP 


615 


100 


405 


X03475 


Rattus
norvegicus 


ribosomal protein L35a (aa 1-110)


576 


99 


406 


AF144237 


Homo sapiens 


LOMP protein 


252 


44 


407 


U20239 


Mus musculus 


fibrosin 


288 


76 


409 


AL033378 


Homo sapiens 


dJ323M4.1 (KIAA0790 protein) 


6026 


99 


410 


X54326 


Homo sapiens 


glutaminyl-tRNA synthetase 


7577 


99 


411


X61585 


Bos taurus


polynucleotide adenylyltransferase 


3715 


97 


412 


AF217190 


Homo sapiens 


MLEL1 protein


5271 


99 


414 


G02815 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 6896. 


314 


95 


415 


AJ245922 


Homo sapiens 


alpha-tubulin 8 


2370 


100 


416 


AF203032 


Homo sapiens 


neurofilament protein 


220 


21 


417 


Z97653 


Homo sapiens 


c380A1.2.1 (novel protein (isoform
1)) 


1567 


100 


418 


AJ404326 


Homo sapiens


SR+89 


1871 


99 


419 


AJ404326 


Homo sapiens 


SR+89 


902 


64 


420 


AF134726


Homo sapiens


G9A 


5334 


99 


421 


L28125 


Podospora 


beta transducin-like protein 


288 


39 


422 


W21733 


Homo sapiens 


NIP-1 encoded by clone 59. 


110 


72 


423 


S67970 


Homo sapiens


ZNF75=KRAB zinc finger


951 


76 


424 


L28035 


Mus musculus 


protein kinase C gamma 


3768 


98 


426 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


555 


56 


427 


Y73373 


Homo sapiens 


HTRM clone 921803 protein 
sequence. 


266 


49 


428 


X61118 


Homo sapiens 


TTG-2a/RBTN-2a 


876 


100 


429 


Z96932 


Homo sapiens 


nuclear autoantigen of 14 kDa


496 


83 


430 


AJ277291 


Homo sapiens


HELG protein


678 


72 


431 


X82157 


Homo sapiens 


hevin 


3525 


99 


432




Homo sapiens


P85B_HUMAN: PTDINS-3-
KINASE P85-BETA


3825 


99 


433 


AL021918 


Homo sapiens 


b34I8.1 (Kruppel related Zinc Finger 
protein 184) 


1713 


50 


434 


AF084464 


Rattus 
norvegicus


GTP-binding protein REM2 


141 


29 


435 


AL049795 


Homo sapiens 


dJ622L5.2 (novel protein) 


1756 


98 


436 


M14513 


Rattus 
norvegicus 


(Na+ and K+) ATPase, alpha(III) 
catalytic subunit 


4269 


99 


437 


U33460 


Homo sapiens


DNA-directed RNA polymerase I, 
largest subunit 


8777 


98 


438 


D87076 


Homo sapiens 


similar to human bromodomain
protein BR140(JC2069) 


3067 


100 


439 


L43912 


Macaca

mulatta 


mannose-binding protein A


589 


93 


440 


D31763 


Homo sapiens 


ha0946 protein is Kruppel-related.


927 


49 


441 


U70976 


Homo sapiens 


arrestin 


2068 


99 


442 


B08069 


Homo sapiens 


A human beta-alanine-pyruvate 
aminotransferase (HAP A). 


2343 


99 


443 


AF100662 


Caenorhabditis 


contains similarity to ubiquitin 


166 


24 











wcu uuAyi'LCJiuuicu jjyuiujabc iaxu. 

UCH-1.hmm, score: 28.46) (Pfam:

lJCH-9 hmm ?rorp- 47 '^V\ 






444 


D78017 


norvegicus


NFI-A1




OR 


445 


AL049569 




dJ37C10.3 (novel ATPase)


2418 


100


448 


AJ242540 


Volvox carteri
f. nagariensis


hydroxyproline-rich glycoprotein
DZ-HRGP 


165




449 


AJ133352 


Homo sapiens 


ZNF237 protein 


2006 


100


450 


AJ133352 


Homo sapiens 


ZNF237 protein 


1025 


96 


451 


AF170708 


Homo sapiens


T-box protein TBX3


3700 


OQ 


452 


AK002080 




unnamed protein product


1546 


00 


453 


L32977 


Homo sapiens


Pipclfp pp-^ nrntpin 


1939 


0"? 


454 


X51760 




71T1P finopr nrntpm /^SR^ A A\ 

£iXxX\^ UllgCl LJlULClil ^JOJ 




J / 


455 


Y01141 


Homo sapiens 


Secreted protein encoded by gene 7 
clone HTT FA9n 

^iVilC txLLjx 


1453 


99 


456 


AB006631 


Homo sapiens 


The human homolog of mouse Cux-2 


6559 


100 


457 


AF067165 


Homo sapiens 


zinc finger protein 3 


977 


64 


458 


AF038169 


Homo sapiens 


unknown 


154 


38 


459 


W75214 


Homo sapiens 


Human secreted protein encoded by 
gene 19 clone HRSMC69. 


1180 


95 


460 


U97002 


Caenorhabditis 
elegans


similar to acyl-CoA dehydrogenases 
and epoxide hydrolases; Pfam 
domain PF00441 (Acyl-CoA_dh),
Score=57.4, E-value=1.7e-16, N=2;
contains similarity to Pfam domain
irruu/uz ^xiyaroiase^, ocore— j/.^, 
F-valiip=:lp-l^ M=l 

d VCllllC iC"lJj IN i 


583 


37 


461 






ULUloXllCU piUlCUi pjuuui^i 


1 (\A 7 
1 Uh 1 


OQ 


462 


M93134 


Friend murine
leukemia virus


pol protein 


289 


44 


463 


AF055473 


Homo sapiens 


GAGE-8 


232 


47 


466 


Y51415 


Homo sapiens


Human wild type pKe83 protein.




1 Oft 


467 


Y51417 


787 


Human pKe83 splice variant protein 


2433 


100 




I J 17 jO 


Homo sapiens


Human transmembrane protein
HTMPN-60. 






469 


D38552 


Homo sapiens 


The hal539 protein is related to 
cyclophilin. 


2995 


100


470




Homo sapiens 


Human Protease and associated 
protein-7 (PPRG-7). 


3530


100 


471




Homo sapiens 


C-terminal variant of hINADL
including 2 amino acid exchanges 
and an insertion of 28 amino acids in 
frame. 




100 


472




Homo sapiens


Human secreted protein clone


1 'iA/i 




473 




T^ntTir* csinipnc 


rTTiimsiTi cPi^rptpH ■nfi^tPin /*lf^Tiia 

diil57 17 nrntein 


OQR 
yyo 


OS 


474 


X63526 


Homo sapiens


Homologous to elongation factor 1-
gamma from A.salina




yy 


475 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125)


644 


100 


476 




XTUlllU actpiCIia 


alnVm 0 tx/r^^ \/TTT /*/\1lQA^n 
dipiia-^ ^yP^ Vlii COliagcn 


jJol 


OO 

yy 


477 




T^Ann/\ cfimpnc 
nuiiiu Sapicilo 






on 
y 1 


478 


AF156929


Sus scrofa


inflammatory response protein 6


1588 




479 


AF264in 


Homo sapiens 


FYVE domain-containing dual 
specificity protein phosphatase 
FYVE-DSP2 


5610 


99 


480 


AF044578 


Homo sapiens 


putative DNA polymerase; POL4P


2478 


94 


481 


X89750 


Homo sapiens 


TGIF protein 


1413 


100 











jUlSJAj'lJUiyialC 

H vH rn oer\ acp 






483 


U58334 




Bbp/53BP2


1556 


HI 




AF1 51 '5'?8 




H(*f>YvrvtiHvl Iran cfpra CP* Rpvln 


4281 




485 


Z98884 


Homo sapiens 


dJ467Ll.l (KIAA0833) 


699 


73 


too 






\j ll^Upill Cli ill *7H 






487 


Z11737 


Homo sapiens 


flavin-containing monooxygenase 4 


2969 


100 






Mus musculus


talin 




/ / 




A T07C1 1 O 


Homo sapiens


putative cell cycle control protein




7Q 




W /HOt-j 


Homo sapiens


Human secreted protein encoded by




ys 


491 


Y41337 


Homo sapiens 


Human secreted protein encoded by 

gCIlC j\} i.'lUllC rxfvL/i_/ V H-/ . 


509 


36 


492 


X90530 


Homo sapiens 


ragB 


1926 


99 


AQ'X 




Homo sapiens 


ragts 






HyH 




Homo sapiens 


ragtj 


loyj 


OA 


495 


AL022394 


Homo sapiens 


dJ511B24.3 (KIAA0395 (probable 
homeobox protein)) 


4990 


99 


HyO 




— — : 

Homo sapiens 


lanthionine synthetase C-like protein 

1 


- ZiOo 


JUU 






Homo sapiens 


KiDosomai protem Kmase rs (^iOJs.-x5 ) 


4UU1 


1 AA 


498 


G01563 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 5644. 


330 


100 


499 


X54131 


Homo sapiens 


protein-tyrosine phosphatase 


10465 


99 


500 


G01082


Homo sapiens 


Human secreted protein, SEQ ID
NO: 5163. 


549 


100 


501 


AC004142 


Homo sapiens 


similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
iniciaCiions, /o simiianty lo 
D49802 (PID:gl369906) 


3676 


100 


502 


AL117544


Homo sapiens 


hypothetical protein 


1226 


100 


503 


AF203032 


Homo sapiens 


neurofilament protein


5115 


99 


504 


AL034417 


Homo sapiens 


bK215D11.2 (similar to rat gene 33)


2476 


100 


505 


X69090 


Homo sapiens 


190kD protein 


7546 


99 


506 


U58755 


Caenorhabditis 
elegans 


coded for by C. elegans cDNA 
ylc34bl.5; coded for by C. elegans 
cDNA ykl3hl0.5; coded for by C. 
elegans cDNA yk46e8.5; coded for 
by C. elegans cDNA yk46d5.5; 
coded for by C. elegans cDNA 
yk43c2.5; coded for by C. elegans 
cDNA yk46e8.3; coded for by C. 
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
ykl3fl0.3; coded for by C. elegans 

CJJINA yKJ'tDl.j 


782


55 


507




— — : 

Homo sapiens 


p>irirz procem 


oUi 


lUU 


508




Rattus
norvegicus


cytoplasmic dynein intermediate 
cnoin ZD 


^OA 1 


an 

y / 


509 


AF063231 


Mus musculus 


cytoplasmic dynein intermediate 
chain 2 


3159 


97 


510 


AF20289^ 






4336 




511 


Y13115 


Homo sapiens 


serine/threonine protein kinase 


5071 


99 


512 


AB030207 


Homo sapiens 


G gamma subunit 


364 


100 


513 


AF039571 


Homo sapiens 


peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAXl 


495 


33 


514 


AB037883 


Homo sapiens 


Gb3/CD77 synthase 


1916 


99 





515 


D90868 


Escherichia 
coli 


similar to 


1489 




516 


X98834 


Homo sapiens 


zinc finger protein Hsal2 


5290 


100 


517 


AF055668 


Mus musculus 


apoptosis-linked gene 4, deltaC form 


2904 


78 


518 


AF019926 


Mus musculus 


protein kinase 


1694 


90 


519 


M34513 


Homo sapiens 


omega protein 


317 


Q1 


520 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein


2313 


yy 


521 


Y08612 


Homo sapiens 


88kDa nuclear pore complex protein 


1561 


QQ 

yy 


522 


AL096766 


Homo sapiens 


dA59H18.1 (KIAA0767 protein)


2497 


100 


523 


AF186249


Homo sapiens 


six transmembrane epithelial antigen 
of prostate 


1790 


100 


524 


AB029012 


Homo sapiens 


KIAA1089 protein 


4933 


100 


525 


AB026893 


Homo sapiens


vascular cadherin-2


5962 


100 


526 


X74331 


Homo sapiens 


DNA primase (p58 subunit) 


1720 


100 


528 


AC007228 


Homo sapiens 


R31665 2 


1488 


47 


529 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit 
preprotein


2639 


100 






elegans 


coded for by C. elegans cDNA
yk172e6.3; coded for by C. elegans

wlWgCUlO Ki>J^li/ik jiXkJOLl ,Jy ^UUCU lUi 

by C. elegans cDNA yk172e6.5






531 


S76838 




Dbs 




oo 


532 


Z82215 


Homo sapiens


dJ6802.2 (myosin, heavy 

polypeptide 9, non-muscle)


9828 


100 


533 


AF245505 




5iHlif*5in 


977 
LI 1 


J 1 


534 


AF300612 


Homo sapiens


sulfotransferase


001 




535 


AL121928 


Homo sapiens


bA18I14.3 (pleckstrin and Sec7
domain protein) 




00 


536 


AJ271055 


Mus musculus 


iroquois homeobox protein 6


1724 


76 


537 


AF180473


Homo sapiens 


Not2p 


2267 


100 


538 


AF071059 


Mus musculus 


zinc finger RNA binding protein


1089 


51


539 


AF023453 


Homo sapiens 


actin-related protein 3-beta 


2219 


100 


540 


AC003030 


Homo sapiens


R29828 1 




7A 
/u 


541 


AC003030 


Homo sapiens 


R29828 1 


2294 


100 


542 


AL121889 




(1J1076F17 1 CKTA AnS?"? nrntpin 
^^continues in AL02'?80'?"1') 


91 S9 


ior» 


543 


AB006135 


Rattus 
norvegicus 


db83 


1238 


98 


544 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 6731. 


644 


97 


545 


Y07595 


Homo sapiens


transcription factor TFIIH


LJ 1 D 


inn 


546 


AL133545 


Homo sapiens


HA^RfiW14. 1 /"nnvpl nrntpin Qimilnr 

\jr\.J*J\Jl^ Irr. 1 ^llUV&l UlULwIll aiiiiiicu 

to a dual specificity phosphatase) 


064 




547 


X83618 


Homo sapiens 


hydroxymethylglutaryl-CoA
synthase 


2647 


inn 


548 


AF134726 


Homo sapiens 


NG37 


4359 




549 


AB035356 


Homo sapiens 


neurexin I-alpha protein


6948 


QQ 

yy 


551 


AB037901 


Homo sapiens 


gene amplified in squamous cell 
carcinoma-1


5215 


99 


552 


AB043634 


Homo sapiens 


PAR-6A 


885 


100


553 


AP000693 


Homo sapiens 


partial CDS 


4875 


99 


554 


AF002223 


Homo sapiens 


myotubularin related 1 


3490 


100 


555 


AC004893 


Homo sapiens 


similar to NEDD-4 (KIAA0093);
similar to P46934 (PID:g1171682)


1611 


100 


556 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


8328 


100 


557 


AJ404468 


Homo sapiens 


axonemal dynein heavy chain 


11137 


100 





558 


X65873 


Homo sapiens 


kinesin heavy chain 


4860 


100 


559 


AJ277365 


Homo sapiens 


polyglutamine-containing protein 


592 


36


560 


AF205600 


Homo sapiens 


transposase-like protein 


407 


27 


561 


X71125


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1914 


100


562 


X71125 


Homo sapiens 


glutaminyl-peptide cyclotransferase 


1456 


97 


563 


X54304 


Homo sapiens 


myosin regulatory light chain 


897 


100 


564 


AF250842 


Drosophila 
melanogaster


multiple asters 


130 


23 


565 


Y58608 


Homo sapiens 


Protein regulating gene expression 
PRGE-1. 


1619 


99 


566 


AL121893 


Homo sapiens


bA189K21.5 (novel protein similar
to retinoblastoma binding protein
(RBBP9))


1012 


100 


567 


AL117352


Homo sapiens 


dJ876B10.2 (novel protein (ortholog 
of rat EX084))


3713 


99 


568 


AF228603 


Homo sapiens 


pleckstrin 2 


1841 


100 


569 


AF239243 


Homo sapiens 


histone deacetylase 7 


3244 


86 


570 


AF087695 


Mus musculus 


veli 3 


989 


100 


571 


AB046381 


Homo sapiens 


testis-abundant finger protein 


1346 


99 


572 


AC005551 


Homo sapiens 


R26529 2, partial CDS 


1020 


100 


573 


Y90290 


Homo sapiens 


Human peptidase, HPEP-7 protein 
sequence. 


274 


52 


574 


W76734 


Homo sapiens 


Human mDia Rho targeting protein. 


712 


32 


575 


AL121935 


Homo sapiens


bA517H2.3 (t-complex 10 (a murine 
tcp homolog))


853 


78 


576 


Y86217 


Homo sapiens 


Human secreted protein HWHGU54, 
SEQ ID NO: 132. 


2123 


99 


577


AL121716


Homo sapiens 


dJ202D23.2 (novel protein) 


6329 


99 


578 


AL121716 


Homo sapiens 


dJ202D23.2 (novel protein)


6329 


99 


579 


X92715 


Homo sapiens 


KRAB /C2H2 zinc finger protein 


3102 


97 


580 


X54637 


Homo sapiens 


protein tyrosine kinase 


5564 


98 


581 


X78817 


Homo sapiens 


p115


1148 


44 


582 


AJ251245 


Rattus 
norvegicus 


SECIS binding protein 2


3086 


71 


583 


AF113125 


Homo sapiens 


E-1 enzyme


581 


100 


584 


M19529


Sus scrofa 


follistatin A


1906 


98 


585 


AF169677 


Homo sapiens 


leucine-rich repeat transmembrane 
protein FLRT3


3403 


100 


586 


D87685 


Homo sapiens 


similar to human transcription factor 
TFIIS (S34159)


8083 


99 


587 


Y00876 


Homo sapiens 


Human LAPH-1 protein sequence. 


2110 


100 


588 


Y99674 


Homo sapiens


Human GTPase associated protein-
25.


2111 


99


589 


D86973 


Homo sapiens


similar to yeast translation activator
GCN1 (P1:A48126)


12033 


99 


590 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 


1979 


100 


591 


Y57396 


Homo sapiens 


Human lysoenzyme LYC4 
polypeptide. 


814 


100 


592 


AJ297743 


Mus musculus 


torsinB protein 


1448 


85 


593 


AF164796


Homo sapiens


NADH-ubiquinone oxidoreductase
MLRQ subunit homolog


469 


100 


594 


Y41312 


Homo sapiens


Human secreted protein encoded by
gene 5 clone HLDRM43.


740 


04 


595 


Y41312 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HLDRM43. 


824 


100 


596 


Y77123 


Homo sapiens 


Human neurotransmission-associated 
protein (NTAP) 998868. 


2102 


98 


597 


AF215703 


Drosophila
melanogaster


KISMET-L long isoform


1880


65








598 


AF070447 


Homo sapiens


barrier-to-autointegration factor


290 


on 


599 


X56203 


Plasmodium
falciparum


liver stage antigen








600 


X79828 


Mus musculus






jj 


601 


AB004109 


Cricetulus 


phosphatidylserine synthase II




7/ 














602 


U94988 


Mus musculus 


Nulp1




so 


603 


U94988 


Mus musculus 


Nulp1




00 


604 


AF006264 


Homo sapiens


recombination and sister chromatid
cohesion protein homolog


100






605 


AF006264 


Homo sapiens


recombination and sister chromatid
cohesion protein homolog


100






606 


X82260 


Homo sapiens


RanGAP1


9070


100


607 


X82260 


Homo sapiens


RanGAP1


1 

1 o*tj 


y / 


608 


AF160909


Drosophila






j6 






melanogaster








610 


X74801 


Homo sapiens


gamma subunit of CCT chaperonin


974S 


OO 


611 


AL031427 


Homo sapiens


dJ167A19.1 (novel protein)


1 UUO 


100


612 


Y71072 


Homo sapiens


Human membrane transport protein,
MTRP-17.


100






613 


X16396 


Homo sapiens




17/10 


100








315) 






614 


AK000281 


Homo sapiens 


unnamed protein product


1 O It 


99


615 


AB011128


Homo sapiens


KIAA0556 protein


*i;7^i 

J l\jy 


99


616 


U19361 


Petromyzon
marinus


NF-180


9ns


91








617 


AF045555 


Homo sapiens 


wbscr1


1208


100


618 


AF045555 


Homo sapiens 


wbscr1 alternative spliced product


1318


100


619 


U22229 


Felis catus 


ribosomal protein L41


128


100


620 


Y17169 


Homo sapiens 


A6 related protein 


1819 


100


621 


Y12065


Homo sapiens 


hNop56 


2956 


OQ 


622 


AF177758


Homo sapiens


ubiquitin specific protease 16


2998


100


623 


AF317425


Homo sapiens


GAC-1 




100


624 


AL050297 


Homo sapiens 


hypothetical protein


1997


99


625 


AC007204 


Homo sapiens


lJ\^Ji. 1 J^J7 1 




OQ 


626 


Z68747


Homo sapiens


imogen 38




yy 


627 




Homo sapiens


imogen 38


1958 


97 


628 


Y70229 


Homo sapiens


iuuiuaii xvtN/\-aSbociaxea proiein''JU 




99 








V^JVlN/xrVT" IV/ }, 






629 






HaSOpOaiyilgcal CorCmOma oSSOCIateu 


Ot:> 










gwUC piULCUl'O 






630 


AF119664


Homo sapiens


transcriptional regulator protein
HCNGP


100






631 


AF119664


Homo sapiens


transcriptional regulator protein
HCNGP


1 1 sn


©y






632 


Y17849


Homo sapiens


ganglioside-induced differentiation
associated protein 1


05i






633 


X55740 


Homo sapiens 


5'-nucleotidase


JV 1^


100


634 


AF039688 


Homo sapiens


antiVen NY-PO-'^ 


7 J 1 


100


635 


AF119662


Homo sapiens


J-^U UIULCUI 


/4Z4 


100


636 








Zj44 


100


637 


AF077818 


Mus musculus 


syntrophin-associated serine-
threonine protein kinase


2027


44






638 


AL035455 


Homo sapiens 


dJ1018E9.1 (VAMP (vesicle-
associated membrane protein)-
associated protein B and C)


150


26






639 


AF078844 


Homo sapiens


hqp0376 protein 


416 


81 





640 


U28377 


Escherichia 
coli 


ORF C39-wasORF fl91 and 
ORF_fl 94 before splice 


1198 


100 


641 


AK024442 


Homo sapiens 


FLJ00032 protein 


1677 


56 


642 


U58682 


Homo sapiens 


ribosomal protein S28


340 


100 


643 


X57432 


Rattus rattus


ribosomal protein S2 


1520 


98 


644 


AB002348 


Homo sapiens 


KIAA0350 protein 


5186 


99 








TVfinnflR IrinQCP \\\t\A\t%o 
IKa.|J^aX> KUiaaC \l.r%J\,j UlUUJLUg 

nrotein Y7H5fi 


1 178 


71) 


647 


AB029482 


Mus musculus


JNK-binding protein JNKBP1


4609 


O i 


648 




Arabidopsis
thaliana


contains similarity to isoamyl
acetate-hydrolyzing
esterase— {'en e id'MOR2 1% 


407 


AA 


650 


AC002550 


Homo sapiens


Unknown gene product


858 




651 


U26592 


Homo sapiens


diabetes mellitus type I autoantigen




uu 


652 


X60155 


Homo sapiens 


zinc finger 41 


4349 


100 






Platynereis
dumerilii


H4 nrotpin /"AA 1 - 10'^^ 












R9704S 9 


Z-} JO 




KJJJ 


yvouH / J 


Mus musculus


rnhlQ 




JD 






XvaLLUo 


tiiiAJiuWj] pjlylciil 




0^ 

7J 


657 




Homo sapiens


similar to RFP transforming protein;
similar to P14373 (PID:g132517)






658 


X92972 




protein phosphatase


xXJvv 




659 


L35269 




zinc finger protein




QO 


660 


AC003682 


Homo sapiens 


F18547 1 


3184 


96 


661 


X79204 


Homo sapiens 


ataxin-1 


4195 


99 


662 


X17620 


Homo sapiens 


Nm23 protein 


965 


99 


663 


AB015617 


Homo sapiens 


ELKS 


1501 


80 


664




Homo sapiens 


interferon regulatory factor 3 


23 J 1 


100 




A To/loos'^ 


Pyrococcus 


LACTOYLGLUTATHIONE
LYASE (EC 4.4.1.5)
(METHYLGLYOXALASE)
(ALDOKETOMUTASE)
(GLYOXALASE I)




40 


666 


Z70200 


Homo sapiens


U5 snRNP-specific 200kD protein


OO 




667 


Z70200 


Homo sapiens 


U5 snRNP-specific 200kD protein 


8589 


97 


668 


API s'?4.sn 




juvenile hormone esterase binding
protein


99'; 




669 


AF227198 


Homo sapiens 


CrkRS 


7231 


99 


670 


X99586 


Homo sapiens 


SMT3C protein 


441 


87 


D / 1 


761 ^RO rHI 


Homo sapiens


1 /-Auo-iyyo i-'iNA encocung a 
human OC-2 protein. 




1 no 




A H '?97n9 

I\J IVJi 


Mus musculus


/\ X r a-ossociaieu locior 




oO 


673 


AF204159 


Homo sapiens 


potassium large conductance
calcium-activated channel beta 3a
subunit


1486


100 


674 


G02061 


Homo sapiens


Human secreted protein, SEQ ID 
NO: 6142. 


558 


99 


675 


G01246


Homo sapiens


Human secreted protein, SEQ ID
NO: 5327.


141 


77 


676


Aijuioojy 


Homo sapiens


mobl 




HZ 


677 


D86970


Homo sapiens 


similar to myosin heavy chain: 
containing ATP/GTP-binding site
motif A(P-loop) 


161 


28 


678 


U83115 


Homo sapiens 


non-lens beta gamma-crystallin like 
protein 


8569 


99 


679 


AF203687 


Homo sapiens 


prolactin regulatory element-binding 
protein 


2181 


100 





680 


M27685 


Mus musculus 


ultra-high sulphur keratin 


650 


. -JO 


681 


U04968 


Cricetulus 
griseus 


nucleotide excision repair protein 


3712 


07 


682 


AF119663


Homo sapiens 


G-protein gamma-12 subunit


356 


100 


683 


G03733 


Homo sapiens 


Human secreted protein, SEQ ID
NO: 7814. 


342 


100 


684 


X67699 


Homo sapiens 


CDw52 antigen 


297 


100 


685 


AF022789 


Homo sapiens 


ubiquitin hydrolyzing enzyme I 


1892 


100 


686 


AJ001006 


Mus musculus 


EMeg32 protein 


938 


96 


687 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


688 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


689 


AF156557 


Homo sapiens 


stomatin related protein 


2036 




690 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID 
NO: 8041. 


593 


100


691 


AF161512 


Homo sapiens 


HSPC163 


738 


100 


692 


AL031115 


Homo sapiens


ZXDA ZXDB (zinc finger X-linked
protein)






693 


L40410 


Homo sapiens


thyroid receptor interactor




1 {\n 


694 


AC004542 


Homo sapiens 


OXYSTEROL-BINDING
PROTEIN-like; similar to P22059
(PID:g129308)


2533 


99 


695 


AF169411 


Rattus 
norvegicus 


PAPIN 


4144 




696 


Y58168 


Homo sapiens 


Human hydrolase homologue HHH-
4.


2144 


100 


697 


AF271994


Homo sapiens 


dopamine responsive protein DRG-1 


1613 


100 


698 


Y41741 


Homo sapiens


HiiTnan PTJ07n4 nrrtt*»in cAnii^nf»f» 






699 


AL133506 


Unknown 


/prediction=(raethod:""genscan"", 
version-""! 0"" <!cnre-""100 1 
/prediction={method: 


825 


48 


700 


Y96870 


Homo sapiens 


Human goose-type lysozyme
(GOLY).






701 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene 


1190 


100


702 


AC003034 


Homo sapiens 


Gene with similarity to rat kidney- 
specific (KS) gene


937 


95 


703 


AJ242832 


Homo sapiens 


calpain 


3756 


100 


704 


S52624 


Homo sapiens 


unknown 


185 


100 


705


AF005081 


Homo sapiens 


skin-specific protein 


652 


100 


706 


Y16793 


Homo sapiens 


keratin, type I


2232 


100 


707 


Y44985 


Homo sapiens 


Human epidermal protein-2.


455 


69 


708 


AF113220


Homo sapiens 


MSTP040 


686 


100 


709 


Y44985 


Homo sapiens 


Human epidermal protein-2. 


408 


65 


710 


Y16132 


Homo sapiens


CDT6 


1874 


100 


711 


Y68775 


Homo sapiens 


Amino acid sequence of a human 
phosphorylation effector PHSP-7. 


2407 


100 


712 


X63422 


Homo sapiens 


H(+)-transporting ATP synthase 


209 


100 


713 


AF169968


Mus musculus 


DNA binding protein DESRT 


1467 


79 


714 


X52563 


Bos taurus 


permeability increasing protein


383 


29 


715 


AJ277739 


Homo sapiens 


RPB11b1alpha protein


480 


98 


716 


AL135791 


Homo sapiens 


bA162G10.3 (zinc finger protein) 


401 


98 


717 


AF223466 


Homo sapiens 


HT015 protein




07 


719 


AF117383


Homo sapiens 


placental protein 13; PP13 


746 


100 


720 


Z98743 


Homo sapiens 


dJ181C9.2 (Rho GTPase activating 
protein 8 (RhoGAP, p50RhoGAP))


324 


100 


721 


AL163815 


Arabidopsis 
thaliana 


putative protein 


653 


61 


722 


G01436


Homo sapiens


Human secreted protein, SEQ ID
NO: 5517.


418


96






723 


AF282919 


Mus musculus


Z^228 






724 


AB023191 


Homo sapiens


KIAA0974 protein


90*; 




725 


AL031778 








(benzodiazepine receptor (peripheral)
(MBR, PBR, PBKS, IBP,
Isoquinoline-binding protein)) LIKE
protein)


100






726 


AL021939 


Homo sapiens 


dJ352A20.2 (aldehyde
dehydrogenase family protein)


1764


100






727 


AF182426


Rattus 


arylacetamide deacetylase 


791 


49 














728 


Y08565 


Homo sapiens


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


3331


00






729 


AF155135 


Homo sapiens


novel retinal pigment epithelial cell
protein


99






730 


AL078606 


Arabidopsis
thaliana


putative protein


977


zc
J J








731 


Y73352 


Homo sapiens 


HTRM clone 1732368 protein
sequence.


100






732 


AF178432 


Homo sapiens 


SW^ nrotein 




1 nn 


733 


Y17832 


Human






1A 

J4 






endogenous 












retrovirus K 








734 


Y28859 


Homo sapiens 


Human mesoderm induction early
response protein ER1


98






735 


U09355 


Oryctolagus
cuniculus


protein phosphatase 2A1 B gamma
subunit


99






736


Y94922 


Homo sapiens 


Human secreted protein clone pv6_1
protein sequence SEQ ID NO:50.


724


99






737 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


738 


AF112200


Homo sapiens 


NADH-oxidoreductase B18 subunit


739 


100 


739 


AF112200 


Homo sapiens 


NADH-oxidoreductase B18 subunit 


613 


88 


740 


AF302154 


Homo sapiens 


SPG protein 


6556 


100 


741 


B25681 


Homo sapiens 


Human secreted protein sequence
encoded by gene 17 SEQ ID NO:70.


1410


99






742 


L27479 


Homo sapiens 


X123 


1237 




743 


L27479 


Homo sapiens 


X123 


1206 


97


744 


Y66745 


Homo sapiens


Membrane-bound protein PRO1186.


588 


99 


745 


AJ001019 


Homo sapiens


ring finger protein 


1292 


99


746 


X68453 


Sus scrofa 


tubulin-tyrosine ligase 


1882 


94 


747 


Y57897 


Homo sapiens 


Human transmembrane protein
HTMPN-21.


1173


100






748 


AF151069 


Homo sapiens 


HSPC235 


1694 


96 


749 


AF182404


Homo sapiens 


mitochondrial uncoupling protein 1 


1674 


100 


750 


AL121993 


Homo sapiens 


dJ776P7.1 (Novel protein) 


2500 


99


751 


AF149825 


Homo sapiens 


PACSIN3


2253 


100 


752 


AL008635 


Homo sapiens 


dJ510H16.2 (high-mobility group
protein 2-like 1)


99






753


Y57914 


Homo sapiens 


Human transmembrane protein
HTMPN-38.


1124


100






754 


AF285109 


Homo sapiens 


septin 3 isoform B 


1766 


100 


755 


AF004161 


Oryctolagus
cuniculus


peroxisomal Ca-dependent solute
carrier


2371


Q')
yj






756 


Z19585 


Homo sapiens 


thrombospondin-4 


4239 


100 


757 


AP001745 


Homo sapiens 


similar to zinc finger 5 protein 


1857 


100 


758 


AF190664 


Mus musculus 


LMBR2 


555 


72 


759 


AF090326 


Mus musculus 


AE-1 binding protein AEBP2 


1540 


97 


760 


AL096677 


Homo sapiens 


dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))


999


94






761 


AC003007 


Homo sapiens 


Unknown gene product (partial) 


649 


96 


762 


U66372 


Bos taurus 


ribosomal protein S29 


230 


73 


764 


Y90899 


Homo sapiens 


D1-like dopamine receptor activity
modifying protein SEQ ID NO:1


1152 


100 


765 


U88169


Caenorhabditis 
elegans 


similar to molybdopterin biosynthesis
MOEB proteins 


1204 




766 


AL118506


Homo sapiens 


dJ591C20.3.1 (novel DnaJ domain 
protein similar to mouse and bovine
cysteine string protein) 


1091 


100 


767 


AK024693 


Homo sapiens 


unnamed protein product 


3767 


100 


768 


Z11518 


Homo sapiens 


histidyl-tRNA synthetase 


2582 


100 


769 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA 
-19 to 4525) 


25529 


100 


770 


AC009360 


Arabidopsis 
thaliana


Contains 3 PF100400 WD40, G-beta 
repeat domains


333 


33 


771 


AB037685 


Mus musculus 


LANP-like protein


1246 




772 


AL161578 


Arabidopsis 
thaliana 


putative protein 


335 


46 


773 


AL161578 


Arabidopsis 
thaliana 


putative protein 


333 


47 


774 


AY008271 


Homo sapiens 


helicase SMARCAD1


5264 


99 


775 


Y21591 




Human secreted protein (clone
CC332-33).


7 1 97 




776 


W88853 


Homo sapiens


Polypeptide fragment encoded by
gene 89.


/ Jj^


100


777 


W88853 


Homo sapiens


Polypeptide fragment encoded by
gene 89.


1 Jit


100


778 


W88853 


Homo sapiens 


Polypeptide fragment encoded by 
gene 89. 


752 


100 


779 


AF196481


Homo sapiens 


RING finger protein; FXY2 


3644 


100 


780 


AL035427 


Homo sapiens 


dJ769N13.1 (KIAA0443 protein.) 


1609 


54 


781 


AB026187 


Homo sapiens


nrotoca dh erin -X^a 


5244 


100


782 


B24458 


Homo sapiens


Human secreted protein sequence
encoded by gene 22 SEQ ID NO: 83.


1002


100


783 


AB027289 


Homo sapiens 


cyclin-E binding protein 1


5421


100


784 


G02916 


Homo sapiens


Human secreted protein, SEQ ID 
NO: 6997. 


627 


100 


785 


AJ245822 


Homo sapiens 


type I transmembrane receptor 


4560 


100 


786 


AJ245820 


Homo sapiens 


type I transmembrane receptor 


4624 


100 


787 


Z48042 


Homo sapiens 


GPI-anchored protein p137


3340 


99 


788 


AL031782 


Homo sapiens 


dJ708F5.1 (PUTATIVE novel 
Collagen alpha 1 LIKE protein) 


2739 


100 


789 


AJ131245


Homo sapiens 


Sec24B protein 


6602 


100 


790 


AF107203


Homo sapiens 


ataxin 2-binding protein 


2008 


100 


791 


Y14690 


Homo sapiens 


procollagen alpha 2(V) 


600 


34 


792


AL031055 


Homo sapiens 


dJ28H20.2 (novel protein) 


1267 


100 


793 


Y36194 


787 


Human secreted protein 


2051 


99 


794 


AB028127 


Homo sapiens 


mannosyltransferase 


2138 


96 


795 


AC007228 


Homo sapiens 


R31665 2 


2738 


79 


796 


AL049482 


Arabidopsis 
thaliana


putative protein 


436


47 


797 


AC004528 


Homo sapiens 


R32184 3 


891 


91 


798 


AB037830 


Homo sapiens 


KIAA1409 protein


7532 


100 


799 


X53793 


Homo sapiens 


5' half of the product is homologous
to Bacillus subtilis SAICAR
synthetase, 3' half corresponds to the 
catalytic subunit of AIR carboxylase 


2232 


100


800 


Y99350 


Homo sapiens


Human PRO1378 (UNQ715) amino
acid sequence SEQ ID NO:33


1343 


100 


801 


AB042636 


Homo sapiens 


junctophilin type3


1225 


47 


802 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


3916 


90 


803 


AB029324 


Rattus 
norvegicus 


TIP120-family protein TIP120B


4961 


90 


804 


AF251040 


Homo sapiens 


putative nuclear protein 


2119 


100 


805 


AB033281 


Homo sapiens


F-box and WD-repeats protein beta-
TRCP2 isoform C


7870


100


806 


U87305 


Rattus 
norvegicus 


transmembrane receptor UNC5H1


3257 


on 


807 


AF118889 


Rattus 
norvegicus 


b-tomosyn isoform




07 

y 1 


808 


AF226993 


Rattus 
norvegicus 


selective LIM binding factor


o / yj 


OS 
yj 


809 


W19919 




Ras). 


jyjy 


99


810 


AL031782


Homo sapiens


dJ708F5.1 (PUTATIVE novel
Collagen alpha 1 LIKE protein)


100


811 


AC002542 


Homo sapiens


similar to C. elegans F11A10.5; 80%
similarity to Z68297 (PID:g1130619)


2294


100


812 


U83246 


Homo sapiens 


copine I 


606 


52 


813 


AF242552 


Gallus gallus


retinovin 


945 




814 


X52332 


Homo sapiens 


zinc finger protein 10


1651 


93 


815 


X52332 


Homo sapiens 


zinc finger protein 10 


2423 


99 


816 


Y09631 


Homo sapiens 


PIBF1 protein


2935 


99 


817 




Rattus 




JOOD 


oc 
yo 


818 


AY004877 


Mus musculus 


cytoplasmic dynein heavy chain


11105 


98 


819 


Y27196 




Human cyclic nucleotide
phosphodiesterase PDE8B(B) amino
acid sequence


J iy\i


100


820 


AF081947 




tektin 


1 134 


81 


821 


AL035106 


Homo sapiens


dJ998C11.1 (continues in
Em:AL445192 as bA269H4.1)


871


100


822 


AF022795 




TGF beta receptor associated protein-
1




94. 


823 


AF015770


Mus musculus


radical fringe


1422 


89 

Oi 


824 


U82695 


Homo sapiens


expressed-Xq28STS protein


1444


99


825 


X77371 


Mesocricetus 
auratus 


COR1


641 


78 


826 


AB014576 


Homo sapiens 


KIAA0676 protein


296 


79 


827 


AL049733 


Homo sapiens 


dJ875H3.1 (APK1 antigen)


1584 


72 


828 


AF222980 


Homo sapiens 


disrupted in Schizophrenia 1 protein 


4418 


100 


829 


Z31560 


Homo sapiens


sox-2 


1683 


100 


830 


AF295773 


Homo sapiens


ral guanine nucleotide dissociation
stimulator


4717


99


831 


AB041926 


Homo sapiens 


GCK family kinase MINK-2


6866 


100 


832 


L04948 


Saccharomyces
cerevisiae


mitochondrial transporter protein


338 


35 


833 


AJ007012 


Mus musculus 


Fish protein 


704 


94 


834 


Z34289 


Homo sapiens 


nucleolar phosphoprotein p130


3455 


99 


835 


U10991 


Homo sapiens 


G2 


8436 


98 


836 


AF230877 


Homo sapiens 


MIP-T3 


2945 


99 


837 


X58288 


Homo sapiens 


protein-tyrosine phosphatase 


7734 


99 


838 


X56958 


Homo sapiens 


ankyrin (brank-2) 


9631 


100 


839 


AC024791 


Caenorhabditis 
elegans 


contains similarity to beta-lactamases 


370 


24


840 


D83197 


Homo sapiens 


ankyrin repeat protein 


SO? 


99


841 


AF053711 


Serinus
canaria


neurofilament medium subunit




"X 1 

i 1 


842 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal
protein L10 encoded by GenBank
Accession Number L25899 


990 


96 


843 


U76343 


Homo sapiens 


GABA transport protein


70Q9 


Ofi 


844 


Y13645


Homo sapiens


uroplakin II


R97 




845 


D21064 


Homo sapiens 


similar to rat general mitochondrial
matrix processing protease mRNA 
(RATMPP). 


9710 




846 


AF192522


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


7047 


100 


847 


AF192522 


Homo sapiens 


Niemann-Pick C3 protein; NPC3 


5472 


100 


848 


X60489 


Homo sapiens 


elongation factor-1-beta


1162 


100 


849 


AC007204 


Homo sapiens 


BC273239 1 


2277 


fn 
\f I 


850 


AC003682 


Homo sapiens 


R28830 1 


2401 


100


851 


AL121583 


Homo sapiens 


bA358N2.1 (novel protein) 


353 


61


852 


Z48475 


Homo sapiens 


glucokinase regulator


99


853 


Z83844 


Homo sapiens


dJ37E16.2 (SH3-domain binding
protein 1) 


J 00*T 


Ofi 

yo 


854 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


390 


36 


855 


AF062741 


Rattus
norvegicus


pyruvate dehydrogenase phosphatase




oU 


856 


Y11411 


Homo sapiens


pristanoyl-CoA oxidase




0$! 

yo 


857 


M97188 


Strongylocentrotus
purpuratus


tektin A1


900 




858 


AB001105 


Homo sapiens 


hippocalcin-like protein 4




100


859 


AF164791 


Homo sapiens 


nutative IJ? '^kDa nrntein 


X I 


100


860 


AF298117 


Homo sapiens 


homeobox nrotein OnrX"? 


x*^ 1 1 


0*5 


861 


AF015264


Rattus
norvegicus


golgi peripheral membrane protein 
p65 


1820 


81 


862 


X16901 


Homo sapiens 


30kb subunit of RAR'^O /74 


19R4 


1 on 


863 


M12140 


Homo sapiens 


envelope protein 


202 


81 


864 


AF161459 








QC 
yo 


865 


AL109983 




UJ / X or X 1.1.1 \^iiu vci ^icu3 XI 
aminotransferase similar to serine
palmitoyltransferase (isoform 1))


AAA 


1 f\r\ 


866 


M77183 


Rattus 
norvegicus 


alpha-1-macroglobulin


227 


4S 


867 


AF272663 


Homo sapiens 


gephyrin


3785 


1 w 


868 


X75285 


Mus musculus 


fibulin-2 


3258 


87 


869 


X82494 


Homo sapiens 


fibulin-2 


3407 


99


870 


AJ297743 


Mus musculus 


torsinB protein 


169 




871 


AJ278313 


Homo sapiens 


phospholipase C-beta-1a


6258 


99 


872 


AF073344 


Homo sapiens


ubiquitin-specific protease






873 


Y91955 


Homo sapiens 


Human cytoskeleton associated 
protein 10 (CYSKP-10)


535 


100 


874 


AJ000414 


Homo sapiens 


Cdc42-interacting protein 4


1 1 JU 




875 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain 
enzyme APOLLON


627 


100 


876 


Y48586 


Homo sapiens


Human breast tumour-associated
protein 47.


9^'?7 


06 
yo 


877 


AF182198 


Homo sapiens 


intersectin 2 long isoform


8764 


99 


878 


L17308 


Gossypium 
hirsutum 


proline-rich cell wall protein 


192 


35 


879 


AF177169 


Homo sapiens 


tropomodulin 2 


1769 


100 


880 


W03627 


Homo sapiens 


Human follicle stimulating hormone 
GPR N-terminal sequence.


210 


23


881 


AL021068 


Homo sapiens 


dJ206D15.3 


2615 


99 


882 


AC005498 


Homo sapiens 


R31665 2 


318 




883 


AF165518 


Homo sapiens 


MAGOH isoform 


182 


94 


884 


D21211 


Homo sapiens 


protein tyrosine phosphatase (PTP- 
BAS, type 3) 


368 


43 


885 


U13045


Homo sapiens 


nuclear respiratory factor-2 subunit 
beta 1 


869 


62 


886 


X52836 


Homo sapiens 


tryptophan hydroxylase (AA 1 - 444) 


2320 


98 


887 


X51466 


Homo sapiens 


elongation factor 2 


4460 


100 


888 


AB039903 


Homo sapiens 


interferon-responsive finger protein 1 
long form 


1096 


98 


889 


X51760 


Homo sapiens 


zinc finger protein (583 AA)


3130


100


890 


AJ243396 


Homo sapiens 


voltage-gated sodium channel beta-3
subunit


1024


100


891 


W67928 


Homo sapiens 


Fragment of human secreted protein 
encoded by gene 4. 


391 


100 


892 


AB020598 


Homo sapiens 


peptide transporter 3 


3017 


100 


893 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1170


4722 




894 


Y66648 


Homo sapiens 


Membrane-bound protein PRO1170


3606 




895 


A29218 cd 
1 


Homo sapiens 


19-NOV-1998 DNA encoding G-
protein coupled 7 TM receptor with 
AX0R15 activity. 


2178 


100


896 


AJ000332 


Homo sapiens 


Glucosidase II 


5063 


99 


897 


X98259 


Homo sapiens 


M-phase phosphoprotein 8 


1085 


100 


898 


X57110 


Homo sapiens 


c-cbl protein 


4849 


99 


899 


X63652 


Homo sapiens 


inter-alpha-trypsin inhibitor heavy 
chain ITIH1


3376 


98 


900 


X85134 


Homo sapiens


RB protein binding protein 


2816 


99 


901 


L11672


Homo sapiens 


zinc finger protein 


2047 


58 


902 


Y85565 


Homo sapiens 


Human homologue of UNC-53 (Hs- 
UNC-53/2) sequence. 


369 


83 


903 


X54871 


Homo sapiens 


ras related protein Rab5b


1094 


100 


904 


Z98265 


Homo sapiens 


plakophilin 3 


4065 


100 


905 


AL035295 


Homo sapiens 


hypothetical protein 


959 


99 


906 


AF051782 


Homo sapiens 


diaphanous 1 


801 


35 


907 


AF208536 


Homo sapiens 


nucleotide binding protein; NOP 


1372 


100 


908 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2365 


98 


909 


U79240 


Homo sapiens 


serine/threonine protein kinase 


2386 


99 


910 


AJ132545 


Homo sapiens 


protein kinase 


2921 


100 


911 


AJ132545


Homo sapiens 


protein kinase 


1637 


99 


912 


AL121733 


Homo sapiens 


hypothetical protein 


1344 


99 


913 


Y67579 


Homo sapiens 


Human death inducer-obliterator 1 
(DIO-1) polypeptide. 


1586 


100 


914 


X87342 


Homo sapiens 


Human giant larvae homologue 


5317 


99 


915 


X87342 


Homo sapiens 


Human giant larvae homologue 


3495 


96 


916 


M94362 


Homo sapiens 


lamin B2


2357 


93 


917 


AJ011654 


Homo sapiens 


triple LIM domain protein 


3432 


100 


918 


AJ131899 


Rattus 
norvegicus 


proline rich synapse associated 
protein 1


5776 


88 


919 


AF054986 


Homo sapiens 


putative transmembrane GTPase 


1816 


100 


920 


U95822 


Homo sapiens 


putative transmembrane GTPase 


1237 


100 


921 


Y11588 


Homo sapiens 


apoptosis specific protein 


1492 


100 


922 


X84195 


Homo sapiens 


acylphosphatase 


510 


100 


923 


U72882 


Homo sapiens 


interferon-induced leucine zipper 
protein 


1409 


99 


924 


AE000660 


Homo sapiens 


hADV36S1


573 


100 


925 


AF126245 


Homo sapiens 


acyl-Coenzyme A dehydrogenase-8 
precursor 


2162 


100 

926 


AE001968 


Deinococcus
radiodurans 


hypothetical protein 


147 


97 


927 


W81576 


Homo sapiens 


EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 


1778 


100 


928 


U01317 


Homo sapiens 


beta-globin 


687 


94 


929 


X98333 


Homo sapiens 


organic cation transporter 


2933 


100 


930 


Y91444 


Homo sapiens 


Human secreted protein sequence
encoded by gene 49 SEQ ID
NO:165.




100


931 


Y91644 


Homo sapiens 


Human secreted protein sequence
encoded by gene 43 SEQ ID 
NO:317. 


1743 


100


932


D90279 


Homo sapiens 


collagen alpha 1(V) chain precursor 


569 


39 


933 


Z31560 


Homo sapiens 


sox-2 


1587 


96 


934


AF147790 


Homo sapiens


transmembrane mucin 12 


3047 


99 


935 


Z85996 


Homo sapiens 


match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554;
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412 
P40393 Q40723; match: CE01798
Q38923 Q40191 Q41022; match: 
Q39433 Q40177 Q40218 Q08146; 

Tnat/*>i' PI (\QAQ PI 1 AO*3 r\ \ <iCiA Q 

maicn. riyjy^y Jrl lUzj V<loy4o 

CYXYXT^n- TtiQ+rVi* ri9^'5RO "DO^OOC 
kI^^JDj /, malCD. v^ZJjOj r^ZDZZo 

P7n^'^fi PflS71"^* mntrh* P'^^97/^ 
O08I47 P17609 matrh* 

Q15771 P36410 P35291; GTP- 
binding 


726 


94 


936 


AB041533 


Homo sapiens 


sperm antigen




JO 


937 


X91906 


Homo sapiens 


voltage-gated chloride ion channel






938 


AB032481 


Homo sapiens 


homeobox transcription factor


1744 




939 


AF131106 


Homo sapiens 


protein serine/threonine phosphatase 
4 regulatory subunit 1 


4682 




940 


Y17999


Homo sapiens 


Dyrk1B protein kinase


3331 


99


941 


AF305872 


Homo sapiens 


thyroglobulin


455 


92 


942 


AF263462 


Homo sapiens 


cingulin




00 


943 


AK024442 


Homo sapiens 


FLJ00032 protein


1616 


61 


944 


Y35911 


Homo sapiens


Extended human secreted protein
sequence, SEQ ID NO. 160.


ZDZ 


ij 


945 


AB015320


Homo sapiens 


sigma1B subunit of AP-1 clathrin
adaptor complex 


599 


71 


946 


Z82287 


Caenorhabditis 
elegans 


ZK550.2 


990 


J J 


947 


D84223 


Homo sapiens 


leucyl tRNA synthetase


(OKI 


99


948 


U49057 


Rattus 
norvegicus 


rA9 


3846 


62 


949 


AK000568 


Homo sapiens


unnamed protein product


1 ^^0 
10 J7 




950 


AL021578 


Homo sapiens 


dT4S'^C19 6 1 /'iinrharnpff»ri'7(='H 
hvDothalamus nroteifi A'<:f>fnmri \\\ 


Z J / 




951 


AB032435 


Homo sapiens 


differentiation-associated Na-
dependent inorganic phosphate
cotransporter




00 
yy 


952 


AF110532


Homo sapiens 


uncoupling protein UCP-4 


1561 


100 


953 


X83587 


Mus musculus 


1A13 protein 


1420 


59 


954 


AL031665 


Homo sapiens 


dJ545L17.5.1 (novel protein) 


386 


53 


955 


Y87600 


Homo sapiens 


Human fatty acid synthase-like
protein (HFASLP). 


2377 


100 


956 


Y99421 


Homo sapiens 


Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 


522 


55 


957 


U68535 


Mus musculus 


aldo-keto reductase 


451 


73 


958 


AC007067 


Arabidopsis
thaliana 


T10O24.10 


1594 


57 


959 


U72194 


Mus musculus 


muskelin 


3947 


99 


960 


AE003661 


Drosophila 
melanogaster 


CG15168 gene product 


277 


54 


961 


X80332 


Mus musculus 


rab20 


983 


82 


962 


Y67315 


Homo sapiens 


Human secreted protein BL89_13
amino acid sequence. 


3916 




963 


Y67315 


Homo sapiens 


Human secreted protein BL89_13 
amino acid sequence. 


3916 


99 


964 


L32602 


Rattus 
norvegicus 


homeodomain 159..341


1821 


96 


965 


Z97832 


Homo sapiens 


dJ329A5.3 (KIAA06460 protein) 


3581 


99 


966 


W88995 


Homo sapiens 


Polypeptide fragment encoded by 
gene 146. 


176 


39 


967 


U12465 


Homo sapiens 


ribosomal protein L35 


604 


100 


968 


AF151803 


Homo sapiens 


CGI-45 protein 


1101 


78 


969 


W74865 


Homo sapiens


Human secreted protein encoded by
gene 137 clone HMWIF35.


1148 


OR 

I'D 


970 


L21936 


Homo sapiens


succinate dehydrogenase flavoprotein
subunit






971 


AJ133521 


Drosophila
buzzatii


protease, reverse transcriptase,
ribonuclease H, integrase


1Q4 




972 


AC006017 


Homo sapiens 


N-acetylgalactosaminyltransferase;
similar to Q10473 (PID:g1709559)


3271 


100 


973 


Z81317 


Schizosacchar 
omyces pombe 


DNA2-NAM7 helicase family
protein 


685 




974 


M17885 


Homo sapiens 


acidic ribosomal phosphoprotein (P0)


792 


100 


975 


U22829 


Mus musculus 


P2Y purinoceptor 


399 


40 


976 


AL132772 


Homo sapiens 


dJ1013A22.1 (hepatic nuclear factor
4, alpha)


2466 


99 


977 


AC003973 


Homo sapiens 


ZNF91L 


1550 


43 


978 


J04031 


Homo sapiens 


MDMCSF (EC 1.5.1.5; EC 3.5.4.9; 
EC 6.3.4.3) 


2824 


63 


979 


AF136715 


Homo sapiens 


taxol resistant associated protein 


217 


76 


980 


AF136715 


Homo sapiens


taxol resistant associated protein


306 




981 


Z92822 


Caenorhabditis 
elegans 


ZK520.1 


1109 


44 


982 


AJ295149 


Homo sapiens 


putative dipeptidase 


1564 


99 


983


AL021331 


Homo sapiens 


dJ366N23.3 (KIAA0173 and 
Tubulin-Tyrosine Ligase LIKE)


1492 


100 


984 


AL161501


Arabidopsis 
thaliana 


putative adenosine deaminase 


370 


38 



TABLE 3



SEQ
ID
NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*








2 


BL00282 


Kazal serine protease inhibitors family
proteins. 


BL00282 16.88 4.259e-14 97-120


3 


BL00298 


Heat shock hsp90 proteins family 
proteins. 


BL00298A 10.97 1.000e-40 74-
119 BL00298E 27.30 1.000e-40
321-376 BL00298F 11.21 1.000e-
40 409-464 BL00298H 20.50
1.000e-40 553-607 BL00298C
16.40 2.286e-40 186-230








BL00298B 15.64 1.290e-39 134-

465-520 BL00298I 30.07 7.818e-
34 661-715 BL00298D 17.97


4 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237A 11.48 4.316e-13 57-82 


5 


PD02454 


! ! ! ! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 


PD02454B 11.61 4.309e-17 75-
103 






FfiF-T TTfF r»nMArM 


'nA/fnnR/?^ A 1^017 f^o oc 
j-'iviuuoo*f/v \.ju.v /.*fzye-ui/ Vo- 

\.yy 


7 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138- 

61-83 


9 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.667e-15 272-289 


10 


BL00139 


Eukaryotic thiol (cysteine) proteases
cysteine proteins.


BL00139D 9.24 4.400e-11 391-
67-77 


12 


BL01113 


C1q domain proteins.


BL01113B 18.26 9.294e-19 689- 
725 BL01113C 13.18 4.857e-11

IjI-III BLUl I IjU /.4/ z.lole- 

10 790-800 


13


BL01113


C1q domain proteins.


oiMiiiio lo.zo j.8lje-J4 jyy- 
635 BL01113C 13.18 4.857e-ll 

(^fSI f.Stn RT mil 'XT\ 7 y17 0 1 /;i a 

10 700-710 


14 


BL00594 


Aromatic amino acids permeases 


BL00594A 16.75 6.531e-10 50-94 


15 


BL01047 


Heavy-metal-associated domain proteins. 


BL01047B 19.73 4.913e-13 707- 

79R 


16 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 7.462e-18 310- 
330 PR00625B 13.48 3.939e-15 
340-361 


18 


BL00615 


C-type lectin domain proteins. 


BL00615A 16.68 3.700e-09 144- 
162 


20 


PR00741 


GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 


PR00741D 16.11 9.082e-21 175- 
195 PR00741F 14.66 9.262e-21 
243-265 PR00741B 14.23 1.947e- 

18 128-145 PR00741G 9.29 
z.l80e-17 jlo-340 PKU0741C 
9.16 7.328e-17 147-166 
PR00741H 10.32 2.141e-13 351- 

89-105 PR00741E 13.39 3.535e- 

19 91 S 9^9 


22 


BL00107


Protein kinases ATP-binding region 

proteins.


BL00107A 18.39 3.647e-20 117-
idR RT nniA7R T? "^i 1 nnnA ia 

182-198 






proteins. 


RT nnin7A is ■^q i /^nn<> 91 i9fi 
157 




BL00107


Protein kinases ATP-binding region

proteins. 


RT nn 1 n7 a 1 ft ■^o 1 Anno 9*5 1 9 ^ 
157 


27 


BL00239 


Receptor tyrosine kinase class II proteins.


BL00239B 25.15 2.324e-16 91- 
139 


28 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 


29 


BL00018 


EF-hand calcium-binding domain 


BL00018 7.41 3.250e-10 681-694 



WO 01/57190



PCT/US01/04098












proteins. 


BL00018 7.41 6.400e-10 717-730




RI 01 1 




RT /^111^A 17 OQ 0 QAQa AO Oi 


33 


PD01168 


SYNTHETASE LIGASE PROTEIN


PDni 1 (^XT 0 47 1 /^fi7p-no 4m 






ALANYL.


*T xU 


34 


PD01168 


SYNTHETASE LIGASE PROTEIN


PDO 1 1 J!T 0 47 1 f\fnp^.(\Q 411- 






ALANYL. 


426 


36 


PR00426 










SIGNATURE


197 




PF00791


Domain present in ZO-1 and Unc5-like


PPnn701T3 7C /lO 7 A/10a 1 a 1 AOA 






netrin receptors.


J. 1 J J 


1^ 

JO 




MADS-box domain proteins. 


Dl AAO CA OA ^A 1 AAAn Af\ 1 CC 

oI^uUjjU zU. /y 1.0UUe-4U lo5 






Alkaline phosphatase proteins.


BL00123B 19.31 1.000e-40 90-
133 BL00123C 24.61 1.000e-40
145-195 BL00123E 22.25 1.000e-








4n QA^ l^C RT AAlO'S/^ 7/; Al 








1 AAAn ylA/lOO jIOO TIT AAIOOT? 

1.000e-4(J 436-400 BL00123F 








1 A AQ OTIyla ^C'i/Tv! OAA 

ly.U3 o./ 146-33 3o4-3yy 








T>T AA1 OTA 1AOAA AAA« O A CO TT 

i3L00123A lO.oO 9.000e-24 52-77 








RT AAl OTTl 1 O 11 1 AAA«. 1 O O 1 ^ 

13bUUi23JJ 12. /3 l.UUUe-17 21o- 








OOA 


AA 


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387








rJJUUUOO I J.^Z O.UUUe-l J 4j5-4/i 








PD00066 13.92 2.714e-12 234-247








x^UUUUoO 1 J.y-i J.143e-lZ 40U-443 








pr^nnn/^/i i"? 07 r 714o 17 si4 ^77 








PT^AAA/^^ 1 Q 07 1 770a 1 1 ylAO y1 1 C 

JrJJUUUOO iJ.yz 3./3ye-ll 4Uz-4l3 








PFinon^fi 1"^ 07 7 n^Rp i n ^ i r ■^'^i 






J Kw KJtlolo 1 AlNlv^b DbNUIVlYL, 


T\AyrAAAT3 A 0 1 TOO C\A£.^ 1 A 1 OA 

UMUUy /3A 21.1/ z.y4De- 1 U 1 oU- 






VT T n7R\X/ PVPT niTPYTA/lTriT? 


717 
zl / 


47 




G-protein coupled receptors family 2 


RT AA/CylA^ 1*7 OO 1 ^00« 1A ^OC 






proteins. 


CAl DT AA/^/1AD OA O OOO™ AA 

->U1 13LUUC)4y±> zO.oo /.3o/e-uy 








41 /-4D3 






PROTEIN ZINC-FINGER METAL-


PD00066 13.92 8.200e-16 445-458








PD00066 13.92 5.846e-15 305-318








PD00066 13.92 1.000e-14 221-234








PD00066 13.92 1.000e-14 417-430








Pr^AAAiC/C 1 "2 AO O OAA«. lA'^ACX O^O 

rUUUUOD 13. yz z.oUUe-l4z4y-zoz 








PT^AAA/TiC 1 O AO O OAA^ 1 A OOO OAfV 

rJJOUUoo 13.92 z.ttOOe-14 277-290 








Pr^AAA/C^ 10 AO O OAAn. 1^1 O*!^ 

rUUUUoo 13.92 <$.oUUe- 14 333-345 








Pr^AAAiCX ^1 AO A ^AAn ^A 1£.t HA 

r'UUUUoo 13.y2 y.4U0e-14 301-374 








PT^AAAiCrfC 1*3 OO A AAAn 11 OOA ylAO 

rUUUUoo 13-yz 4.UUUe-13 3oy-4Uz 








Pr^AAA<#^ 1 1 07 /? ^71 o 1 7 All A C(C 

Jri-'uuuoo 1 j.yz O.J / le-iz 4 /3-4oo 


3 i 




IT £1 : 

Intermediate filaments proteins.


DT AAOO^T^ 1 A 1 A 1 f\f\f\^ Af\ >1 1 T 

BL00226D 19.10 1.000e-40 417-








4A4 RT An77/;R 7*3 C*^ Q Q/IQa 1^ 








7ST 700 RT nA77*^r' 1 "3 7'3 1 470o 

zji-zyy jDJ-fUvzzoL^ ij.zj i.*tzye- 








74 147 rt ftn77AA 17 77 

ZH- J 10- / JDlyUuZZOA. IZ. / / 








1 R^7p 1 S 1 si If^A 








PPAA7 1 nr^ 1 A Q1 ^ ^/IQa AO 1 11 

j'lvuuzi/L' lu.yi j;o4oe-uy 133- 






QtnXT A XT TO 


1 /lO' 

I4y 






Cadherins extracellular repeat proteins 


DT A/\OOOD "^O OA 1 AAA-. >1A 1 

BL00232B 32.79 1.000e-40 143-






domain proteins. 


lyl xJJLUUzJzA Z.j50e-28 
















252-300 BL00232C 10.65 6.625e- 








20 250-268 BL00232B 32.79 








1.314e-11 367-415 BL00232C








10.65 9.308e-10 470-488 


54 


BL00303 


S-100/ICaBP type calcium binding


BL00303B 26.15 8.759e-23 125- 






protein. 


162 BL00303A 21.77 1.000e-21


58 


PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE


PR00378D 16.86 1.000e-15 242-

1 no. 190 


59 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 9.040e-12 120- 
140 


60 


BL00280 


Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 6.727e-38 238-282 
BL00280 24.61 1.514e-30 294-338


65 


BL01019 


ADP-ribosylation factors family proteins. 


BL01019A 13.20 1.222e-11 43-83


68 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-13 188-
212 PR00237G 19.63 7.207e-13 
268-295 PR00237A 11.48 4.375e- 
11 24-49 PR00237C 15.69
3.057e-10 101-124 PR00237D 

o.y^ 4. /jUe-lU 13 l-ijy 

PR00237F 13.57 5.364e-10 230-
57-79 




PD01066


FINGER METAL-BINDING NU. 


pr^ftiAAA 1 o /ii 7 0'3Qa oq 1 1 nf\ 
rUKJlKJOO ly.HJ /.yooe-zo 3 1- /U 


71 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE


PR00830A 8.41 8.759e-12 348- 


72 


BL00120 


Lipases, serine proteins. 


BL00120B 11.37 2.149e-10 148- 
163 


77


PR00753


1-AMINOCYCLOPROPANE-1-

CARBOXYLATE SYNTHASE


rK0075jb o.Ol 3.552e-ll 191- 
216 PR00753D 6.85 2.778e-09 

1 J 1-133 


78 


PR00506 


D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE
SIGNATURE


PR00506C 19.40 8.017e-09 96- 

1 1 o 

1 ly 


82 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.571e-16 436-
467 


84 




OlgLUa'JH- llllClat^UlUil UUUlalll prULclilb 

ATP-binding region A proteins. 


DUJKiO /JA o.ouue-iu zDq- 
300 


85 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.286e-30 117-160


87 


BL00250 


TGF-beta family proteins. 


BL00250A 21.24 6.786e-36 264- 
300 BL00250B 27.37 1.450e-26 

3ZO-J04 


91


BL00215


Mitochondrial energy transfer proteins.


1JLUU21 DA 1 J.oz y.2jUe-17 10-3j . 

BL00215A 15.82 6.000e-16 221- 

OA/^ HT n/17 1 < A 1 ^ BO 7 Q<7*» 1 7 

108-133 BL00215B 10.44 9.526e- 

1 1 K^R.IRI 
11 lOo-Aol 


92 


BL00027 






95 


PR00094 


ADENYLATE KINASE SIGNATURE 


PR00094C 12.94 1.000e-08 119-
li^> 


96 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR TMMTINOGT O 


PD02327B 19.84 2.091e-09 143-

1 OJ 


97 


BL00752 


XPA protein


RT nn7S9R 10 17 7 IflOp 00 9R 79 


98


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e-10 135- 
149 


99 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.824e-12 122-
141 


100 


BL00027


'Homeobox' domain proteins.


BL00027 26.43 7.429e-31 118-161 


101 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 6.870e-12 370-387
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
BL00028 16.07 6.100e-10 258-275


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 665-
679 PR00048A 10.52 8.500e-14
581-595 PR00048A 10.52 9.250e-
14 637-651 PR00048A 10.52
2.059e-12 609-623 PR00048A
10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-
567 PR00048A 10.52 2.895e-11
525-539 PR00048A 10.52 4.316e-
11 441-455 PR00048A 10.52
5.263e-11 413-427 PR00048B
6.02 2.125e-10 569-579
PR00048B 6.02 4.935e-10 513-
523 PR00048A 10.52 5.696e-10
497-511 PR00048B 6.02 8.875e-
10 429-439 PR00048B 6.02
1.000e-09 457-467 PR00048B
6.02 6.684e-09 485-495


103 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 5.364e-22 31-50
PR00195B 9.47 1.783e-21 56-74
PR00195C 11.50 3.455e-21 126-
144 PR00195D 11.76 8.714e-21
175-194 PR00195F 16.20 8.500e-
20 217-237 PR00195E 9.82
8.650e-20 194-211


104 


BL01113 


C1q domain proteins.


BL01113A 17.99 1.865e-09 121- 
148 BL01113A 17.99 5.846e-09
82-109 


105 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 6.400e-11 70-99

■DT f\{\A'^f\ A OAytOO COC« 1AT5 

t)L,UU4zOA zU.4.i 6.jz5e-10 /3- . 

102 BL00420A 20.42 5.708e-09 


108 


PR00860 


VERTEBRATE METALLOTHIONEIN
SIGNATURE 


PR00860B 7.04 2.929e-20 27-41 
PR00860A 5.46 5.500e-16 5-18 

PR00860C 9.61 1.474e-14 41-51


112 


BL01031 


Heat shock hsp20 proteins family profile. 


BL01031C 17.68 6.400e-10 122- 
147 


114 


DM01840 


kw SPAC24B11.09 R07E5.13.


DM01840B 22.04 2.688e-40 59-
i\J3 JLiAa0l64uA 10.95 9.571e-13 
J 1-4 J 


115 


BL01126 


Elongation factor Ts proteins. 


BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-
135 BL01126C 9.20 9.735e-11
190-203 


116 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 4.375e-21 35-85


118 
1 Xo 




Catalase proximal heme-ligand proteins.


Dl AA/f^TA lO OO 1 AAA— At\ 

BL00437A 18.82 1.000e-40 49-
101 BL00437B 16.28 1.000e-40
114-168 BL00437C 21.86 1.000e-

An ion oon pt aa^oti-i i c m 

1.000e-40 248-301 BL00437E

ZD.yj l.\}\J\)Q-H\j oJ.f'D /y 






Ubiquitin carboxyl-terminal hydrolase

family 1 cysteine activ. 


208 BL00140C 11.80 5.444e-10
77-102 


120 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 6.712e-10 95- 
148 


122 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 1.000e-40 16-62


123 


PR00041 


CAMP RESPONSE ELEMENT 


PR00041D 7.95 2.906e-09 24-41 
BINDING (CREB) PROTEIN
SIGNATURE 




124 


PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN
SIGNATURE 


PR00041D 7.95 2.906e-09 24-41 


125 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061C 7.86 3.250e-10 212- 
222 


126 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.400e-25 251-290 


127




SIGNATURE 


rKUUJloiJ 10.2o i.!yuue-J4 ziyr 
248 PR00318B 14.79 3.455e-27

lOo-lyl rlvUUJlov^ iz.uy /.UUue- 

23 197-215 PR00318A 7.84
2.500e-12 265-275


128 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 9.743e-10 67-89 
PR00927B 14.66 4.575e-09 69-91 


130 


BL00824 


Elongation factor 1 beta/beta'/delta chain


BL00824B 9.21 7.750e-22 133- 


131 


BL00824 


Elongation factor 1 beta/beta'/delta chain
proteins.


BL00824C 14.58 1.000e-40 166-
204-239 BL00824B 9.21 7.750e-
1.000e-19 247-263


132 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 9.222e-13 1209-
1228 


133 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 9.222e-13 1168-
1187 


134


PR00708


ALPHA-1-ACID GLYCOPROTEIN

SIGNATURE


rKUU/U8U 14.0/ i.UUUe-Z/ 141- 

168 PR00708C 11.77 1.643e-25 
24 73-95 PR00708E 13.33 
14.40 2.636e-21 51-70 


135 


PR00109 


DOMAIN SIGNATURE 


145 


136 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 3.250e-10 201- 

917 


137 


BL00471 


Small cytokines (intercrine/chemokine) 

f^-Y— Qiih"Fjimilv cionsit 

w A, V> OUUXCUXi.llj' OlglldL. 


BL0047 1 23.92 7.480e- 1 0 42-90 


140 


PR00205 


CADHERIN SIGNATURE 


PR00205B 1 1.39 5.582e-10 328- 
346 PR00205B 11.39 9.018e-10 
543-561 


141 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 7.704e-09 976- 
1027 


143 


PR00979 


TAFAZZIN SIGNATURE


PR00979E 10.83 5.950e-26 192- 

914 PPftA070A 1 1 01 Q 77'2.a 

rKuuy/yA 11.71 o.//je-^D 
63-83 PR00979C 12.16 6.400e-19 

19 170-185 PR00979F 10.14 
3.382e-15 230-244 PR00979B 
15.59 5.636e-15 94-106 


145 


DM00686 


kw REPLICATION REP 28K 17.7K. 


DM00686C 14.14 7.720e-09 111-
131 

XJ X 


146 


PR00604 


CLASS IA AND IB CYTOCHROME C
SIGNATURE 


PR00604D 15.86 1.000e-17 87-
104 PR00604B 12.73 9.591e-16
57-73 PR00604C 10.21 8.200e-12
73-84 PR00604E 10.13 1.000e-11
106-117 PR00604A 11.13 8.800e-











11 44-52 PR00604F 8.60 1.000e-
10 I'J^-l^'? 


147 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 3.864e-15 266- 
297 BL00107B 13.31 6.143e-11
335-351 


148 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 8.448e-09 67-81 




PR00069


ALDO-KETO REDUCTASE
SIGNATURE


risxjyjxjoyu ly.ot) l.oj/e-iU io/- 

41-66 PR00069E 18.14 3.100e-22
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
R 071p-io ini-i9n 

o.u/ic-iy lUl 1^1/ 


150 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 2.688e-27 139-182 


151 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 

PCRTTnrMTPrriTKrc t vaqt? tp 
roJfcUJJL;UiNXL/iJNJD J_,i AoJi liv. 


PD02906C 24.17 7.070e-22 165- 
Jriju/yuoij ij.oD B.jyje-ij 

09 71-84 


1 J J 


BL00479


Phorbol esters / diacylglycerol binding
domain proteins. 


914 BL00479B 12.57 1.837e-11

0 1 S-OT 1 


158 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-31 143-186 




BL00422


Granins proteins.


RT nA/177/^ 1 A 1 ft 7 7^rto 1 7 i47n_ 

448 


162 


PR00625 


DNAJ PROTEIN FAMILY 


PR00625A 12.84 9.297e-ll 62-82 


164 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 6.182e-10 347-


166 


PR00860 


VERTEBRATE METALLOTHIONEIN 
SIGNATURE 


PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74
PR00860C 9.61 1.900e-15 97-107


167 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.052e-09 196- 
218 


169 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 1.346e-39 316- 
353 BL00514G 15.98 2.241e-34 
471-501 BL00514H 14.95 6.571e- 
27 510-535 BL00514E 14.28
1.273e-16 388-405 BL00514D

15.35 9.100e-15 369-382 
BL00514B 16.42 4.857e-14 260- 
276 BL00514F 11.65 9.690e-14
416-431 BL00514A 11.68 8.200e- 
11 149-159 


170 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins.


BL00514C 17.41 1.346e-39 268-
305 BL00514G 15.98 2.241e-34 
423-453 BL00514H 14.95 6.571e- 
27 462-487 BL00514E 14.28 

1.273e-16 340-357 BL00514D

15.35 9.100e-15 321-334 
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14 
368-383 BL00514A 11.68 8.200e- 
11 101-111


171


BL00514


Fibrinogen beta and gamma chains C-
terminal domain proteins.


BL00514G 15.98 2.241e-34 385-

415 BL00514H 14.95 6.571e-27 
424-449 BL00514C 17.41 4.632e- 
24 230-267 BL00514E 14.28 
1.273e-16 302-319 BL00514D 
15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-
228 BL00514F 11.65 9.690e-14 
330-345 BL00514A 11.68 8.200e- 
11 101-111 


173 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.400e-29 119-162


174 


DM01970


kw ZK632.12 YDR313C
ENDOSOMAL m. 


DM01970B 8.60 5.119e-15 1391- 
1404 


176 


BL00773 


Chitinases family 19 proteins.


BL00773C 9.42 8.000e-09 2-16


182 


PR00109


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9.163e-14 141- 
160 


183 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-. 


PD01937A 6.68 3.475e-09 221- 
232 


185 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.946e-23 247-279

BL00845 16.43 1.628e-21 107-132 


186 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 525-
541 


187 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e-11 497-
513 


188


DM01803 


1 HERPESVIRUS GLYCOPROTEIN H. 


DM01803A 10.51 1.000e-09
1081-1102 


189 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 5.091e-15 69-82 


190 


PR00194


TROPOMYOSIN SIGNATURE


174 PR00194E 8.74 3.250e-30 
231-257 PR00194D 9.57 1.500e-

1 t J lyy rJWjKj lyHD lu.it 

5.200e-24 120-141 PR00194A
7 86 4 857e-91 84-1 0'' 


192 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.154e-09 131- 
146 PD02042A 91 13 5 OOOe-OO 
94-121 


193 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15


195 


BL00463 


Fungal Zn(2)-Cys(6) binuclear cluster
domain proteins. 


BL00463 8.22 5.071e-09 111-123


196 


PR00118 


BETA-LACTAMASE CLASS A 
SIGNATURE 


PR001 IRF 16 47 9 "^Rfip-OO IfiS- 
181 


197 


DM00215 


PROLINE-RICH PROTEIN 3 



nM0091 5 10 4'? S 494p-n0 914. 

267 


198


BL00660 


Band 4.1 family domain proteins. 


BL00660A 31.50 5.500e-11 714-
767 


199 


BL00282 


Kazal serine protease inhibitors family
proteins. 


BL00282 16.88 8.820e-13 70-93 


202 


PR00009 


TYPE I EGF SIGNATURE 


PR00009A 14.15 5.345e-15 971-
987 PR00009C 14.11 8.773e-13
996-1008 PR00009D 16.83
8.000e-11 1008-1018 PR00009C
14.11 1.882e-09 892-904


203 


BL00025 


P-type 'Trefoil' domain proteins. 


BL00025 17.17 4.536e-19 38-59 


205 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 7.300e-10 165-178


206 


PR00168 


SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 


PR00168D 12.88 6.865e-11 67-86


207 


BL00025 


P-type 'Trefoil' domain proteins. 


BL00025 17.17 3.423e-20 39-60 
BL00025 17.17 8.750e-16 88-109 


209 


BL00646 


Ribosomal protein S13 proteins. 


BL00646B 21.42 6.100e-30 110- 
143 BL00646A 25.82 6.192e-29 
14-62 


210 


PR00138 


MATRIXIN SIGNATURE


PR00138D 16.56 3.605e-25 279- 
305 PR00138C 16.41 3.000e-24
218-247 PR00138E 6.01 8.714e- 
13 314-328 PR00138A 15.14 

15.82 4.522e-12 188-204 


211 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 8.429e-12 386- 
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408


212 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP. 


PD01941A 14.81 1.000e-40 163-
217 PD01941B 15.02 9.705e-30 

zj 0J/-00'* rUKJiy^^ii^ ly.yo 
8.200e-20 508-563 PD01941D 
27.18 1.600e-16 661-710
prim 041 P 9R ^9 0 f^^t^ 1^1 ftn^ 

1060 


213 


BL00362 


Ribosomal protein S15 proteins. 


BL00362 24.67 8.313e-09 330-373




BL00115


Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 


1227 BL00115Z 3.12 6.096e-09
1164-1213 


215


BL00038


Myc-type, 'helix-loop-helix' dimerization
domain proteins. 


iSLiVVuioD io.y/ /.ouue-io 123- 
146 BL00038A 13.61 1.474e-13
1 no n R 

1 UZ- 1 i 0 


216 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 2.241e-22 49-82 
BL01108B 11.40 8.457e-10 96-
107 


217 


PR00381 


KINESIN LIGHT CHAIN SIGNATURE 


PR00381A 9.55 1.321e-10 360- 
378 


222 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 2.358e-26 1166-
1203 BL00514G 15.98 9.000e-15
1289-1319 BL00514D 15.35
6.936e-12 1207-1220 BL00514F

BL00514H 14.95 8.636e-10 1318-


223 






DLi\}\JjZ.JD ZX.OO 1.UUUC-4U 73- 

139 BL00325A 24.83 9.333e-24
61-93 


224 


BL00018 


EF-hand calcium-binding domain 

proteins.


BL00018 7.41 1.450e-10 231-244


225 


PF01329 


Pterin 4 alpha carbinolamine dehydratase.


PF01329B 18.52 1.692e-18 67-92 


228 




ABC transporters family proteins.


1065 BL00211B 13.37 8.875e-18 
1.900e-09 931-943


230 


PR00761 


BINDIN PRECURSOR SIGNATURE 


PR00761A 5.81 9.366e-09 275- 

909 


231 


PR00049 


WILM'S TUMOUR PROTEIN 


PR00049D 0.00 3.500e-10 54-69 


232 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 1.978e-10 109- 

lOU oLiUvHLZiJ 10.34 4.1ZZc-Uy 

133-184 


233 


BL01210 


Caveolins proteins.


RI m210R n 99 8 19Qe-00 106- 
156 


236 


BL00939 


Ribosomal protein L1e proteins.


BL00939F 17.27 5.393e-09 861- 
891 


238 


BL01252 


Endogenous opioids neuropeptides 
precursors proteins. 


BL01252D 18.25 3.571e-28 205- 
233 BL01252B 19.09 5.034e-27 
37-67 BL01252C 18.10 1.621e-21
164-190 BL01252A 14.22 7.107e- 
18 14-34 


239 


BL00302 


Eukaryotic initiation factor 5A hypusine 
proteins.


BL00302 14.81 1.000e-40 25-79


240 


PR00420 


AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 8.851e-13 26-49 


241 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e-09 235- 
289 


243 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.


PD01066 19.43 8.527e-25 11-50 


244 


BL01270 


Band 7 protein family proteins. 


BL01270C 16.91 6.745e-17 115- 
144 BL01270B 18.74 6.857e-17
76-115 BL01270E 13.03 6.016e- 
15 182-211 BL01270D 20.87 


245 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 6.305e-12 253-
308 PF00791B 28.49 1.909e-11
427-482 PF00791B 28.49 2.651e-
09 179-234 PF00791B 28.49 


246 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.500e-13 277-290 
PD00066 13.92 9.143e-12 193-206 
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262


247 


BL00406 


Actins proteins. 


BL00406D 12.58 6.400e-20 465- 
520 BL00406B 5.47 4.857e-14 
249-304 BL00406E 8.44 1.000e-
11 522-572 BL00406C 6.75


248 


BL00951 


ER lumen protein retaining receptor 
proteins. 


BL00951C 19.35 1.000e-40 112-
161 BL00951A 15.10 7.750e-39
21-57 BL00951D 13.94 6.000e-38
161-196 BL00951B 14.23 3.100e- 
31 57-88 


252 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.129e-15 200- 
227 BL01113A 17.99 4.818e-14
194-221 BL01113A 17.99 7.818e- 
14 182-209 BL01113A 17.99 
1.730e-13 185-212 BL01113A 
17.99 6.595e-13 191-218 
BL01113A 17.99 6.077e-12 203-
230 BL01113A 17.99 9.182e-11

■ 1 70 OCiA Til f\1 1 1^ A IT OO 0 
■10 1 7/w9n^ "RT m 11 A 1 7 00 

9.043e-10 218-245 BL01113A 

1 7 go Q 49fift.l n 900-9^*? 
1 1 ^yy 7 ."T^ vv— 1 V X v7^j D 

BL01113A 17.99 4.115e-09 137-
164 


257 


BL00845






259 


PR00248 


METABOTROPIC GLUTAMATE 
GPCR SIGNATURE


PR00248G 12.67 2.688e-09 53-78


260 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 441-452 
BL00678 9.67 5.800e-10 481-492 
BL00678 9.67 8.800e-10 358-369 


261 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 415-426 
BL00678 9.67 5.800e-10 455-466 
BL00678 9.67 8.800e-10 332-343


262 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 3.400e-10 468-479 
BL00678 9.67 5.800e-10 508-519

BL00678 9.67 8.800e-10 385-396 


263 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BT 50009R 1 5 IX •? OODp-lfl 41^ 
429 


264 


BL00049 


Ribosomal protein L14 proteins.


130 


265 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 438-470


266 


PD01469 


GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 


PD01469 20.59 2.091e-14 279-311


267 


BL00567 


Phosphoribulokinase proteins.




269 


BL00049 


Ribosomal protein L14 proteins.


jji^\j\j\i*^y\^ I I.JO z.Dooc-zo 

128 BL00049B 18.42 6.806e-24 
54-86 RT 00040 A 11 86 R 111p-10 

19-42 BL00049D 13.47 5.765e-12 
129-140 


272


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 9.735e-12 14-58 


273 


PR ftnn? 1 


SIGNATURE 


pp nnrto lAA"^! 1 Olio no c i o 
832 


275 


PR00179


LIPOCALIN SIGNATURE


PR00179B 9.56 2.895e-13 124- 
137 PR00179A 13.78 3.250e-11
36-49 PR00179C 19.02 6.040e-11

1 J^^l /u 


276 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 8.364e-17 22-44 
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172- 
195 PR00449B 14.34 5.680e-10 


277 


BL00140 


Ubiquitin carboxyl-terminal hydrolase 

xaXLilXy X CydLClliC dVliV. 


BL00140D 22.64 1.000e-40 161-
■ on^ RT fimj.nr' ti Jif» o n^'^o "in 

ZVJ DLiW l'r\J\^ l I .o\J y.\JjJC-j\J 

79-104 RTOmdOA I^O^iOAftftp- 

28 5-35 BL00140B 12.29 4.649e-
17 37-55


278 


PD02712 


ELEMENT TRANSPOSASE FOR 
TRANSPOSON TRANSPOSABLE. 


PD02712A 23.03 8.011e-09 47-81


279 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.474e-09 100-111 


282 


DM00892 


3 RETROVIRAL PROTEINASE 


nM00809P 91 5S 4 IfHp-') 1 8fi4- 

898 


283 


BL00048 


Protamine P1 proteins.


RT 00048 fi 10 9 5S0p-00 "16-81 


286 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 1.878e-11 36-54


287 


PR00310 


ANTI-PROLIFERATIVE PROTEIN
BTG1 FAMILY SIGNATURE


P1?00110R 10 50 4 91 1p-17 90-^0 

PR00310D 9.10 6.679e-16 89-119


289 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.




293 


BL00979 


G-protein coupled receptors family 3 
proteins.


BL00979L 20.63 3.800e-12 111-

1 "59 

I 


295 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e-16 195-229


296 


BL01064 


Pyridoxamine 5'-phosphate oxidase 
proteins. 


BL01064A 27.84 8.313e-28 77- 
129 BL01064C 15.22 7.136e-25
202-235 


297 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 2.929e-13 37-56 
BL00030B 7.03 1.900e-11 167-
177 BL00030A 14.39 2.000e-10 
128-147 
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ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION - 


RESULTS* 


298 


BL01183 


ubiE/COQ5 methyltransferase family 
proteins. 


BL01183B 21.31 6.660e-12 143- 
188 


299 


BL01279 


Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa. 


BL01279A 94 27 S 8/^9^-1 1 'in 
105 


301 


BL00191 


Cytochrome b5 family, heme-binding 
domain proteins. 


BL00191K 17.38 4.951e-27 184-
228 BL00191J 11.37 6.447e-17
128-150 


302 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 3.893e-16 33-67 


306 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 2.988e-09 416- 
451 








254 PR00245D 10.47 4.000e-15
274-286 PR00245B 10.38 8.200e- 
15 177-192 PR00245E 12.40 

S 71ilf>-19 ^Ol.^nfi 


309 


BL00203


Vertebrate metallothioneins proteins. 


BL00203 13.94 2.245e-10 612-658 


310 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 7.632e-23 119- 
159 BL00237C 13.19 3.864e-15 
251-278 BL00237D 11.23 3.739e- 
12 312-329 


311 


BL00380 


Rhodanese proteins. 


BL00380D 15.90 8.200e-28 110- 
136 BL00380G 11.26 5.800e-16 
ZO/-ZJSU tsLOUjoOD 14. // /.OOOe- 
14 49-62 BL00380F 9.76 5.886e- 
13 203-214 BL00380C 15.67 
7.387e-13 82-98 BL00380E 12.44 
7.000e-11 181-193 BL00380A
10.48 1.000e-09 10-20


312 


BL00227 


Tubulin subunits alpha, beta, and gamma 
proteins.


BL00227B 19.29 1.000e-40 50-

LVJ DIjUOZZ l.UUUe-4U 

111-163 BL00227D 18.46 1.000e-
40 220-274 BL00227F 21.16
1.000e-40 372-426 BL00227A
24.55 3.250e-39 1-35 BL00227E
24.15 8.500e-34 324-359


327 


BL00232 


Cadherins extracellular repeat proteins 
domain proteins. 


BL00232B 32.79 7.362e-21 225- 
273 BL00232B 32.79 2.588e-17
435-483 BL00232B 32.79 6.301e-

1 ^ 1 1 A 1 TJT AnTlOD TO 

1 J 1 1 0- 1 04 xjJLrUUZjZD iA.ly 

6.769e-13 330-378 BL00232C
10.65 9.341e-12 223-241 
BL00232C 10.65 5.696e-11 328-

433-451 


329 


PD02749 


TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 


PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2- 
1 ^ 


330 


PR00391 


PHOSPHATIDYLINOSITOL 

TRANSFER PROTEIN SIGNATURE


PR00391E 12.50 7.785e-15 211-
83-104 PR00391D 12.21 9.328e- 

1'^ 101 907 PRnmOl A 1 R'X 

5.390e-11 16-36


332 


BL01030 


RNA polymerases M / 15 Kd subunits 
proteins. 


BL01030 23.44 1.818e-23 87-125 


337 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


PD01066 19.43 2.929e-32 6-45 


340 


PD02711 


SYNTHASE 


PD02711B 14.26 1.973e-20 944- 
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ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






PHOSPHORIBOSYLFORMYLGLY. 


968 


343 


BL00223 


Annexins repeat proteins domain
proteins. 


BL00223C 24.79 1.000e-40 245-
300 BL00223B 28.47 8.714e-38 

27 98-132 BL00223A 15.59 
8.750e-27 26-60 BL00223C 24.79 
9.438e-16 13-68 BL00223C 24.79 
2.735e-15 85-140 BL00223A 
15.59 2.253e-11 258-292


346 


PR00345 




PR00345E 8.54 7.652e-28 158- 
110-134 PR00345D 10.97 1.964e- 
5.645e-16 52-71


347 


BL00586 


Ribosomal protein L16 proteins.


BT n0586R 1 7 on 1 9T5p-1 S 1R4- 

221 


348 


PR00388 


3',5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 


PR00388A 10.45 2.778e-09 86- 
105 


351 


BL00018 


EF-hand calcium-binding domain
proteins. 


BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257 


354 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 1.947e-09 256-267


358 


DM01206 


CORONAVIRUS NUCLEOCAPSID 


DM01206B 10.69 3.278e-09 175- . 

183-203 DM01206B 10.69 
o.ojje-i/y uz-iDz L'iviuIzUoJd 
10.69 8.861e-09 181-201 

nx/Tft 1 oVifSK 1 n /^o o i no i nn 
uiviui-<LUOjj lu.o" ".^ loe-u" \ 1 1' 

197 


361 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP 


PD01498C 24.90 6.880e-14 219- 
263 


362 


PD01498 


OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 


PD01498C 24.90 6.880e-14 219-
263 


365 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 


366 




Sulfatases proteins.


TIT f\f\^'J1'C 1 O 0*7 1 AAAa m ^ 1 O 

jdjluujzjji 1 .uuue-z3 j 1 o- 
348 BL00523A 13.36 5.500e-16 

129-140 BL00523G 9.46 5.500e- 
10 506-516 


369 


BL00107


Protein kinases ATP-binding region
proteins. 


BI00107A 1 8 19 4 R18p.no 91-';9 


370 


BL00880 


Acyl-CoA-binding protein. 


BL00880 17.52 1.000e-40 75-125


371 


BL00107 


Protein kinases ATP-binding region 


BL00107A 18.39 1.000e-23 276-

107 RT nniHTR 11111 (?Q9p 19 

342-358 


372 


PR00211 


GLUTELIN SIGNATURE


rswyjzL lo u.oo o.ouzc-i i jZo- 
347 PR00211B 0.86 6.106e-10

190-141 PR0n9nR 0 R/; 1 l/i7p 

09 333-354 


373 


BL00279 


Membrane attack complex components /
perforin proteins. 


RT nn97QF 17 1 1 0 140p.in 74Q- 

797 


375 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.231e-33 10-49


377 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.563e-28 10-49 


379 


BL00598 


Chromo domain proteins.


BL00598 14.45 5.781e-16 3-25 
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NO: 


NO. 










HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


878 


383 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE


PR00413D 11.28 8.941e-09 864- 

878


387 


BL01060


X^lagClla UoilDpUli JJIULCili 1 1 Ir i.aUllJjr 

proteins. 


RT oiofiOA 1 ^ 1 no T31 
174 


388 


PR 00909 


ALPHA/BETA GLIADIN FAMILY

SIGNATURE 


PR oo9noR 4RRA'^iRo 11 1 ono 
1028 


joy 


PR00R^7 


AT T FTJf?T7>J VS/TPV-1 FAA/fTT Y 

SIGNATURE 


483 


391


RT 00940 


XVCLfCjylUl lyiUDliiC JtVilla£>C Ulcuo ill 

proteins. 


142 




PR00014 


FIBRONECTIN TYPE III REPEAT


PR nnn t at\ i o ha r ii i i n i 

/ UO 


393 


PR00014 


FIBRONECTIN TYPE III REPEAT
SIGNATURE 


PR00014D 12.04 8.412e-10 706- 
721 


394 


BL01209 


LDL-receptor class A (LDLRA) domain 
proteins. 


BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105 






Ribosomal protein L30 proteins. 


t5JLUUoJ4 j4.jo 4.09Ue-lj 70-121 


396 


BL01013 


Oxysterol-binding protein family 
proteins. 


BL01013D 26.81 8.000e-26 358- 
402 BL01013A 25.14 7.231e-21
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121


TOT 




Peripherin / rom-1 proteins.


DT AAmAC IT OA 1 AAA_ Af\ CC f*'^ 

BLOOyjOt 17.80 1.000e-40 56-92 
BL00930D 9.12 4.632e-37 12-56
BLrUuyjur lo.y 1 z.ouue-jo yz- 
133 


400 


PR00780 


LEUSERPIN 2 SIGNATURE 


PR00780B 4.89 4.491e-09 262- 
285 




PROORl 0 


SIGNATURE 


T>i?nr)Q ion \(\ 0*3 1 icQa 11 A OA 
rKUUoiyjD iU.oJ /.I joe-ll 4-iU 


403 


BL00381


Endopeptidase Clp serine proteins. 


BL00381C 23.84 1.250e-32 150- 

iy4 ULUUiolA 10.46 Z.2Soe-i2 
7i4 111 RT AmQlR 71 >17 Q I'^dex 


405 


BL01105 


Ribosomal protein L35Ae proteins. 


BL01105A 17.37 1.000e-40 4-49
RT m insR 19 o*; 1 ofto<»_/in ar 

108 


406 


BL00344 


GATA-type zinc finger domain proteins. 


BL00344 17.99 7.000e-12 814-852 


407 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 9.750e-09 73-94 


409 


PR00910 


LUTEOVIRUS ORF6 PROTEIN


PR00910A 2.51 4.321e-09 9-22 


410 


BL00762 


WHEP-TRS domain proteins.


BL00762A 23.43 1.000e-28 752-
789 BL00762A 23.43 4.400e-21 

903-940 BL00762A 23.43 5.413e-

18 825-862 BL00762B 16.14 
8.759e-12 1154-1168 


412 


BL00690 


DEAH-box subfamily ATP-dependent 
helicases proteins. 


BL00690B 13.38 5.320e-15 262- 
280 BL00690A 6.87 1.818e-13
230-240 


415


BL00227


Tubulin subunits alpha, beta, and gamma

proteins. 


RT nn777R 10 70 1 AHAo /tA ^7 

107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361
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ill 
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DEsCKUPTlON 


RESULTS*








BL00227A 24.55 1.000e-33 1-35


416 


PF00992 


Troponin. 


PF00992A 16.67 1.711e-09 557- 


418 


BL00541


Nuclear transition protein 1 proteins. 




419 


BL00541 


Nuclear transition protein 1 proteins. 


BL00541 8.44 9.875e-09 197-251 


420 


PF00856


SET domain proteins. 


rruoojoA 20.14 y.u/4e-ij yoi- 
938 PF00856B 16.42 2.397e-12 

QCl O'7'l 


421 


BL00678


Trp-Asp (WD) repeat proteins proteins.


cSIjUUO/o y.O/ 6.2l;Ue-12 jJ-44 


423 


PD01066


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 


1>F\A1 AAA 1 O y1 "3 0 AAAa ^ A 1 '*A 1 AA 


424 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 1.305e-17 421-
472


426 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


427 


PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e-12 3-21 


428 


BL00478 


LIM domain proteins. 


BL00478B 14.79 3.250e-13 115- 
130 BL00478B 14.79 9.036e-13 
50-65 


431 


BL00282 


Kazal serine protease inhibitors family 
proteins. 


BL00282 16.88 8.875e-12 464-487 


432 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12
125-151 PD00930B 33.72 2.521e-
10 214-255 


433 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.649e-34 34-73 


434 


PR00449 


TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 


PR00449A 13.20 7.563e-11 56-78


436 


PR00120


H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-19 705- 
722 


437 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115T 8.45 7.273e-29 1208-
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 5.000e-
17 1604-1650 BL00115M 19.19
8.130e-16 731-774 BL00115H
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.538e-
10 913-953 BL00115S 18.24
7.968e-10 1010-1052 BL00115U
10.34 4.475e-09 1242-1265


438 


PF00628 


PHD-finger. 


PF00628 15.84 4.536e-10 219-234


440 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.351e-34 10-49 


441 


PR00309 


ARRESTIN SIGNATURE 


PR00309A 9.68 5.250e-24 32-55 

riTiAA^AArx n aa a A'30« n oaa 

PR00309D 7.09 4.936e-23 290-
309 PR00309B 7.81 2.800e-21
69-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e-
15 374-389 


442 


BL00600 


Aminotransferases class-III pyridoxal-


BL00600B 19.60 7.324e-14 103- 
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m 
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NO. 
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RESULTS* 






phosphate attachment si. 


129 BL00600G 12.43 2.125e-12 
306-325 BL00600F 8 77 8 IOSp- 
12 271-284 BL00600E 16.43 
3.167e-ll 228-257 BL00600D 
8.71 8.650e-09 207-221


443 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 3.160e-18 69-87 


444 


BL00349 


CTF/NF-I proteins. 


BL00349A 10.07 1.000e-40 8-54
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-

213-255 BL00349H 15.70 7.387e-
36 361-399 BL00349B 10.51
2.227e-34 54-82 BL00349D 11.70
9.100e-34 125-152 BL00349G
19.72 5.781e-30 323-356


445 


BL00154


E1-E2 ATPases phosphorylation site 
proteins. 


BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15 
124-165 


448 


DM00215


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.882e-11 82-115


451 


BL01283 


T-box domain proteins. 


BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39 

38 170-212 BL01283C 13.05 
7.750e-19 222-236


452 


PR00420 


AROMATIC-RING HYDROXYLASE 
MONOOXYGENASE) SIGNATURE 


PR00420A 14.78 2.579e-11 3-26


453 


PR00162


SIGNATURE 


PRnni^>9R 19 77 7 490p.l7 91 S 

228 PR00162A 9.35 2.324e-14
193-205 PR00162C 8.10 7.120e-
14 227-240 


454 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.000e-10 87-126


456 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.333e-18 1149- 
1192


457 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.737e-24 16-55 


459 


BL00290 


Immunoglobulins and major

histocompatibility complex proteins. 


RI nn90nA 90 RO l S90p-14 1S4. 

177 BL00290B 13.17 9.000e-12 
214-232 


460 


PR00413


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY SIGNATURE 


PR00413F 14.91 7.333e-11 193-
214 PR00413E 15.78 5.714e-09
175-192 


463 


PR00759 


BASIC PROTEASE (KUNITZ-TYPE)
INHIBITOR FAMILY SIGNATURE


PR00759B 1 1 26 R ■?8'5e-09 Ti-R*? 


466 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 4.200e-19 300- 
330 


467 


BL00019 


Actinin-type actin-binding domain 

proteins.


BL00019D 15.33 4.200e-19 300- 


469 


PR00153 


CYCLOPHILIN PEPTIDYL-PROLYL 

CIS-TRANS ISOMERASE

SIGNATURE 


PR00153D 11.99 3.250e-15 510-
S97 punni^'^r' ii ni 4fiR9f» i4 

495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57 
1.720e-13 452-465 


470 


BL00491 


Aminopeptidase P and proline 
dipeptidase proteins. 


BL00491C 12.15 3.912e-09 557-
572 


471 


PD00289 


PROTEIN SH3 DOMAIN REPEAT


PD00289 9.97 1.000e-14 1482-
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RESULTS* 






PRESYNA. 


1496 PD00289 9.97 8.650e-11
1122-1136 


474 


BL50040 


Elongation factor 1 gamma chain profile.


BL50040D 17.41 1.000e-40 279-

oZy x>i-.jUU4Un lo.iy l.UUUe-4U 

40 390-428 BL50040C 22.62 
3.739e-38 141-184 BL50040B 
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 


H ID 


BL01144


Ribosomal protein L31e proteins.




476 


PR00007 


COMPLEMENT C1Q DOMAIN

SIGNATURE


PR00007C 15.60 2.421e-21 589- 

Oil rKUUUU/li 14. 10 J.JUOe-^l 

544-564 PR00007A 19.33 6.897e- 
on ^AA x>\i f\(\(\f\nT\ o fiA 

6.571e-12 623-634 


477 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 5.846e-10 170- 
189 


479 


DM01970 


0kwZK632.12 YDR313C 
ENDOSOMAL ID. 


DM01970B 8.60 9.500e-17 967- 
980 




PR00868


DNA-POLYMERASE FAMILY A (POL
I) SIGNATURE


PR00868C 13.76 5.688e-17 284-
308 PR00868A 16.33 3.186e-13
224-247 PR00868H 12.51 3.388e-

li 4j1-446 rKUUooSl 10.6/ 

7.938e-ll 462-476 PR00868E 

Ij.ly 1 .OUoC-lU J4U-JOO 


481 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.182e-22 53-96 


482 


BL00061 


Short-chain dehydrogenases/reductases 
family proteins. 


BL00061B 25.79 3.647e-21 188-


483 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.750e-12 1032- 
1051 






Ank repeat proteins.


TJCAAAOO A T JZ AO A £0 C« 1 A TiCA 

776 PF00023A 16.03 3.571e-09 
715-731 


486 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR 


PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 


487 


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 

SIGNATURE


PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24
27-46 PR00370C 12.72 4.000e-21 
140-157 PR00370E 11.96 9.229e- 
21 3Z\)-33y rK00370D 16.33 
l./jUe-ZU loj-ZU4 JtKUUj /Ur 

17 7^ 7 "^O^p-Ofi "^7^ "^0^ 
1 /./J /.jyjo-zu J ij-jyj 


489 


PD01675 


GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. 


PD01675C 19.89 2.330e-10 55-89


492 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 45-57 




BL00211


ABC transporters family proteins.




494 


BL00211 


ABC transporters family proteins. 


BL00211A 12.23 5.050e-09 58-70 


495 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670

T5T A A ATT 1£ A'J 1 £1C ^ 1 A T7A OO'^ 

dUjOvZ/ 26.43 3.o25e-10 779-o22 


497 




Protein kinases ATP-binding region

proteins. 


D1~AJ\J L\J 1 /\. J.OUUC-.ZZ ilf- 

245 BL00107B 13.31 1.000e-13
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 


499 


BL00383 


Tyrosine specific protein phosphatases 


BL00383E 10.35 1.000e-14 1902-
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SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 






proteins. 


1913 BL00383D 11.92 3.077e-14
160id-lo/j rSJLUUjSjA 13.34 
5.500e-14 1730-1745 BL00383C 
10.10 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
i7J0 dlajKjoood /.oi i.oyze-ll 
1755-1764 


501 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e-09 136- 
150 PR00019A 11.19 1.667e-09 
91-105 PR00019B 11.36 4.600e- 
09 160-174


503 


BL00226 


Intermediate filaments proteins. 


BL00226D 19.10 1.000e-40 367-
414 BL00226B 23.86 6.143e-27
195-243 BL00226A 12.77 7.840e- 
14 96-111 BL00226C 13.23 

Z.OUUe-lj JUy-i4U iiLUU22oC 

13.23 6.143e-12 266-297 
BL00226B 23.86 1.209e-09 146-


505




■^-RTQPWOQPHOfiT VPFRATR 

TTyrnFPP'NrriFTJT Pnn<iPT-ir>r;i vpcr 


ruuz4u/r /.oi o. /jye-uy yio- 


506 




HECT-domain (ubiquitin-transferase).


1023 PF00632B 18.45 1.155e-ll 
940-968 


507 


BL01082 


Ribosomal protein L7Ae proteins. 


BL01082 20.37 4.273e-20 76-116


508 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.421e-09 493-504 


509 


BL00678 


Trp-Asp (WD) repeat proteins proteins.


BL00678 9.67 2.421e-09 473-484 


510 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10 
763-778 PR00320C 13.01 6.760e- 
10 567-582 PR00320A 16.74 
/.olbe-lO 64o-ool PKU0320A 

1 A 'T/l Q y11 AO Ifi'X inQ 

ppnn'^9nA i 74 o/^Rp" no <;a7 

JTIVUvJZO/t. 10./*t O.ZO0C-UJ7 jO/- 

582 


511 


BL00479 


domain proteins. 


RT AA47QP T7 AT 0<Ao 10 T7A 

183 


512 


BL50058


G-protein gamma subunit profile.


"RT ^AA^R 07 O'i 7 /10/1a AO 1A <Q 


513 


BL00524 


Somatomedin B domain proteins. 


BL00524A 9.65 8.925e-14 80-101 


515 


BL00041


Bacterial regulatory proteins, araC family
proteins. 


HT AAA/1 1 0'3 OO 1 0^/1a 1 O /lOO ^0/1 






PROTEIN ZINC-FINGER METAL-
BINDI.


PT^AAA^>C 10 QO 0 CA/Ia 7 0 OAI ^A^I 

rJJUUUOO iJ.yz o.3UUe-l3 jyi-404 


517 


BL00415




"DT AAidl 4 CO O OOl A AO O^O 

996 


518 


PR00109 


TYROSINE KINASE CATALYTIC 


PR00109B 12.27 9.471e-12 126- 


519 


BL00290 


Immunoglobulins and major
histocompatibility complex proteins.


BL00290B 13.17 4.750e-09 47-65 


522 


PR00505 


D12 CLASS N6 ADENINE-SPECIFIC 
SIGNATURE 


PR00505A 14.15 7.128e-09 364- 

•201 


525 


BL00312 


Glycophorin A proteins. 


BL00312B 9.22 5.781e-10 891- 
920 


528 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.500e-32 16-55 


529 


PR00254 


NICOTINIC ACETYLCHOLINE
RECEPTOR SIGNATURE 


PR00254D 15.50 4.000e-17 131-
150 PR00254A 11.23 4.706e-14
61-78 PR00254C 11.36 4.000e-12
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IF) 

NO: 


ACCESSION 

NO 


DESCRIPTION 


RESULTS* 








113-126 PR00254B 12.97 1.486e- 
11 95-110 


531 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign.


BL00741B 14.27 6.870e-16 787- 
810






SIGNATURE 


rissjKJiyiu 14.JO j.l4je-34 447- 
476 PR00193C 12.60 lAyi^-yi 

O 1 <C '^A A DD Am O^D 1 1 /TA 1 n Cf\^ 

zlo-Z'W JrKUUiyjD )J.o9 7.730e- 
29 167-193 PR00193A 15.41 


533 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR 


PD02870B 18.83 5.596e-09 348- 

JO 1 


535 


PR00683 


SPECTRIN PLECKSTRIN 
HOMOLOGY DOMAIN SIGNATURE 


PR00683D 15.87 2.452e-10 465- 
484 


536 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.684e-24 164-207 


538 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 2.739e-09 225- 
237 




BL00406


Actins proteins. 


BL00406C 6.75 1.000e-40 157-
212 BL00406B 5.47 6.143e-37 

AA lyic TIT f\f\Al\cr\ 11 CO A £r\r\^ 

90-143 BL00406D 12.58 4.600e-
36 291-346 BL00406E 8.44 

Z.ZUue-jJ J04-414 IJUJU4U0A 

9.95 4.441e-23 7-42 




PR00456


RIBOSOMAL PROTEIN P2

SIGNATURE 


DDAA/fCiTC? 1 t\C A iCOC« lA AA CA 


J*f i 


PR00456


RIBOSOMAL PROTEIN P2

SIGNATURE 


PR00456B 3.06 9.625e-10 44-59




PF00023


Ank repeat proteins.


DCAAAT5 A \C AO T OCT^ 1 1 l'>0 

154 


544 


PF00642 


Zinc finger C-x8-C-x5-C-x3-H type (and 
similar). 


PF00642 11.59 9.082e-10 838-849


546 


BL00383 


Tyrosine specific protein phosphatases
proteins. 


BL00383E 10.35 4.115e-10 104-
115 


547 


BL01226 


Hydroxymethylglutaryl-coenzyme A
synthase proteins. 


BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-
167 BL01226D 11.60 1.000e-40
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74
1.000e-40 386-434 BL01226I
25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-
321 BL01226B 13.35 1.818e-31
95-127 BL01226F 9.78 8.714e-23
253-271




BL00964


Syndecans proteins. 


l5LUUyo4B 12.05 z.42oe-10 1246- 

1 9CO 


551 


DM01930 


2 kw FINGER SMCX SMCY 
I L/ivuyo w . 


DM01930E 15.41 1.367e-37 170- 
zij UMUiyjUr 14.10 o.zjze-zo 
267-303 DM01930B 19.86
9.163e-10 37-71


552 


BL00195


Glutaredoxin proteins. 


BL00195B 15.31 7.158e-09 9-29 






Tyrosine specific protein phosphatases 
proteins. 


BL00383E 10.35 2.756e-12 436- 
447


555 






PPnnAftlTJ 19 TO 7 ^1 9ia 1 1 199 

137 PR00403A 16.82 3.912e-10 
107-121 PR00403B 12.19 2.068e- 
09 76-91 


558 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 2.714e-26 76-98 
PR00380D 9.93 3.000e-24 275- 
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TAT? f /"'"O TTOT I ^T M 


RESULTS* 








297 PR00380C 13.18 5.154e-20 
226-245 PR00380B 12.64 9.400e- 


559


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 5.333e-09 522-531 


561




PROTEIN AMINOPEPTIDASE

PRECURSOR HYDROLASE SIGNA.


PD01795B 11.56 2.333e-12 159-
172 PD01795A 10.27 1.000e-09
1 ^ < 1/1/1 


562


PD01795 


PROTEIN AMINOPEPTIDASE 

PRECURSOR HYDROLASE SIGNA.


PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09
86-95 




BL00018


EF-hand calcium-binding domain
proteins. 


jDJL.uuuis /.4i i.jyie-uy 41-34 


565 


BL00348 


p53 tumor antigen proteins. 


BL00348F 23.19 4.143e-09 188- 
231 


567 


PD00301 


PROTEIN REPEAT MUSCLE 

OAT /"TI "WK T5T 


PD00301B 5.49 4.115e-09 284-
295 


569 


PF00850 


Histone deacetylase family. 


PF00850E 8.88 6.553e-21 756-782 
PF00850D 14.76 1.519e-16 722- 
746 PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e-
li oJj-6 /j 


570 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 

PRESYNA.


PD00289 9.97 4.960e-10 137-151 


571 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.800e-11 44-53


573 


BL00299 


Ubiquitin domain proteins. 


BL00299 28.84 1.123e-11 123-175


574 


PF01140 


Matrix protein (MA), p15.


PF01140D 15.54 3.700e-10 986- 
1021 


576 


BL00284 


Serpins proteins. 


BL00284C 28.56 5.200e-26 200- 
242 BL00284A 15.64 4.913e-18
71-95 BL00284B 17.99 7.261e-15
173-194 BL00284D 16.34 5.846e-
13 306-333 BL00284E 19.15
7.429e-12 387-412 


579 


PD01066 


PROTEIN ZINC FINGER ZINC- 

FINGER METAL-BINDING NU.


PD01066 19.43 6.553e-29 15-54 


580 


BL50001 


Src homology 2 (SH2) domain proteins
profile. 


BL50001B 17.40 4.500e-12 1010- 

lOj 1 


581 


PD00930 


PROTEIN GTPASE DOMAIN 

ACTIVATION.


PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.805e-17




BL00612


Osteonectin domain proteins.


126 


585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 8.859e-10 102- 
122 


586 


PF00628 


PHD-finger.


PF00628 15.84 3.455e-12 235-250


587


BL00027


'Homeobox' domain proteins. 


BL00027 26.43 6.063e-10 85-128 


588 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN
FAMILY SIGNATURE


PR00326A 8.75 7.525e-16 227-
248 PR00326C 9.79 6.760e-15 
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 15.74 
9.229e-13 248-267 


589 


BL00422


Granins proteins.


2378 


590 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e-10 295- 
339 


591 


BL00128 


Alpha-lactalbumin / lysozyme C proteins. 


BL00128A 20.76 3.423e-13 35-65 
BL00128C 19.34 2.980e-11 110-
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RESULTS* 








132


596 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 3.136e-09 31-46 


597 


DM00547


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 



DM00547C 17.30 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28
1.000e-17 179-193 DM00547D
11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-
726 DM00547A 12.38 4.818e-11
158-170 


600 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 1.882e-27 13-52 


601 


BL00192 


Cytochrome b/b6 heme-ligand proteins.


BL00192A 11.90 6.400e-09 390- 
430


602 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118- 

157


603 


BL00936 


Ribosomal protein L35 proteins. 


BL00936B 27.27 8.615e-09 118-
157


606 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09
323-337 


607 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e-10 292- 
306 PR00019A 11.19 5.667e-09 
323-337 


608 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 9.500e-12 168- 
183 PR00320A 16.74 2.853e-10 
60-75 PR00320A 16.74 4.706e-10 
14-29 PR00320C 13.01 5.320e-10
60-75 PR00320C 13.01 5.680e-10
14-29 PR00320A 16.74 6.049e-09
217-232 PR00320B 12.19 8.875e- 
09 168-183 


610


BL00750 


Chaperonins TCP-1 proteins.


BL00750B 16.17 1.000e-40 70-
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31 
431-471 BL00750F 18.40 5.125e- 
30 370-411 BL00750E 24.59 
8.650e-29 295-332 BL00750H
21 M l.OOOe-27 489-524 
BL00750C 25.65 5.345e-17 149-
181 BL00750D 16.16 6.318e-14
203-222 


613 


BL00766 


Tetrahydrofolate
dehydrogenase/cyclohydrolase proteins. 


BL00766B 24.49 1.000e-40 142-
1 90 BT 00766F H 7S 1 Onnp-40 
322-359 BL00766C 25.86 5.500e- 
39 208-256 BL00766D 17.05 
4.536e-26 283-313 BL00766A
21.48 6.063e-24 102-132


615 


BL00256 


Adipokinetic hormone family proteins. 


BL00256 12.28 3.298e-10 746-755 


616 


BL00319 


Amyloidogenic glycoprotein extracellular 
domain proteins.


BL00319C 17.12 9.053e-09 419-
453 


617 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 4.429e-09 44-63


618 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 4.429e-09 44-63 


620 


BL00325 


Actin-depolymerizing proteins. 


BL00325B 21.66 5.817e-16 77- 
123 


622 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 


BL00972A 11.93 5.500e-19 213-
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JJli<a\-ivJLr 1 xxjlS 


RESULTS*






















family 2 proteins.










501-526 BL00972B 9.45 1.000e-








1 1 '507-'3m RT nnoTir' t^c /is? 
11 j.yi-jKji DiJKjKjy 1 lO.Ho 








"i. 1 11 KJ< pt ArtmoTT 
3.l0ve-ll J /U-Joj JoUJUy/2Ji 








on 70 O <T7a ia <o< CAO 




rUK) I UOO 


JrJtvUliiUN Z/JJNC rllNOiiK 


rUUlUOo iy.43 o.ijie-3y 0-45 






CTMniTTJ XyrnXAT PTKTTMXT/^ VTT T 








T^Th A T^Kr«v ciiK^omiKf A HTP ^0T\an/1on^ 

LfniJ\u~DOA suDlainiiy /iir-ucpcDucm 


oJLUUUoyu /i.o/ /./DUe-jj 4/5- 






helicases proteins. 


>l "DT AAA"? A A 1 O J <l ^ Aft A r 

524 BL00039A 18.44 2.000e-25 
















1 C OOT OCT PT AAAQAD 1 A 1 A 
















PTiHAm^^A 1 A OA 7 AAAa lO OOO 










0^ J 






DT^AAO A/C A 1 A OiC T AAA™ 1 1 'I AA 

rUt)030oA lOJZo 7.0006-12 290- 






ppp/TTPC^^p PI7 


0 AA 

304 






5'-nucleotidase proteins.


Ttt ft/XT 0.ry~^ ft i ^ 1 ^ -1 /- 1 A\ r> 

BL00785C 9.45 3.625e-16 108- 








1 OO . PT AAOO CP 1 C O C A AAA«. 1 £ 








nA OAC T5T AA^Ot A A TO ^ 

279-295 BL00785A 9.73 6.500e- 






■ 


14 zy-4U oL>UU/oC)£5 10. Dj 








C CAAn 1 O 0£ TiT AA'TOCTX A OA 

5.500e-13 72-60 BL00785D 9.89 










OJO 




PAXILLIN SIGNATURE


PPAACSOP 1 A A1 a OA1 a 1 A OC 

rK0Uo.5zJb 14.44 9.9016-14 ht>- 








lUo 


Oj / 


PR00109


TYROSINE KINASE CATALYTIC


PR00109B 12.27 6.362e-13 221-








240


QJ O 


pFnnA'3 ^ 


MSP (Major sperm protein) domain 


PPAAAKP 1 ^ Q/ A QAAa 1 1 A<CO 

rrUUOpDxJ tj.o4 4.yOUe-l 1 4d3- 






proteins.


C Al 

j02 






VERTEBRATE METALLOTHIONEIN


PR00860B 7.04 1.900e-18 85-99






SIGNATURE


PR00860C 9.61 1.474e-14 99-109








PR00860A 5.46 1.720e-14 63-76




PD00066


PROTEIN ZINC-FINGER METAL-


.PD00066 13.92 4.462e-15 271-284. 








T\TNftAft^^ 1 ft*^ j4 J 1 /" rtft 1 ^ 

PD00066 13.92 4.462e- 15 299-312 








T1T\AAA^^ 1 ^ A*^ 1 DAA— 1 >l *> jl A 

PDOOOoo 13.92 2.800e-14 327-340 








PD00066 13.92 2.800e-14 383-396 








PD00066 13.92 2.800e-14 411-424 








PD00066 13.92 7.000e-14 355-368 








PD00066 13.92 8.800e-14 439-452 








PD00066 13.92 8.800e-14 495-508 




- 




PD00066 13.92 1.500e-13 551-564 








PD00066 13.92 7.000e-13 467-480 








PD00066 13.92 7.000e-13 523-536 








PD00066 13.92 9.500e-13 215-228 








PD00066 13.92 9.500e-13 243-256 








PD00066 13.92 9.500e-13 579-592 








PT^AAAA/? 1 1 00 (I <1 ^A 1 A /^AO iCOA 
rUwOOO Ij.yz o.Olje-lU OU/-02U 








xuuuwoo ij.7Z 1 .ouue-uy loz-zuu 






Ribosomal protein S28e proteins. 


PT AAOiC 1 P 1 1 OA T A1A« 

Dl/UOyolD 11.24 /AZy&-ij o/- 








1 AA T5T AAQAl A O OA A AOOa O/; 

luu jjjjUuyoiA y.yu 4.u/ye-zo 








d.0 




PT JinsR^ 

DLAJyJDoj 


ixiDosomai proxem oD proiems. 


PT AACO< A OO AO 1 QAl^ >1A 1AO 

oLOUDojA 2o.43 1.3916-40 103- 














.- 


1 01 01A 

iy3-23U 


647 




1 rp-/vsp ^ wj-'^ rcpcdi protems proiems. 


RT A A Q fiO 0 /I AAa 1 A 1 5 1 1 OO 


648 


PR00876 


NEMATODE METALLOTHIONEIN 


PR00876C 6.15 9.229e-09 112- 






SIGNATURE 


126 


652 


PD01066 


PROTEIN ZINC FINGER ZINC- 


PD01066 19.43 5.941e-27 29-68 






FINGER METAL-BINDING NU. 




653 


BL00047 


Histone H4 proteins. 


BL00047A 13.53 1.000e-40 2-41 
SEQ 
ID 
NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








t5LUUU4/r> o.ji 1.4zye-40 41-74 
Dl_fUUU4/^ IZ.lo l.JlUc-ofi /4- 

104 




xLJxf IVOO 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


rjjuiuoo iy.4j H.juye-zj iu-oy 


655 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 3.4S3e-17 19-63 


657 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 8.286e-10 31-40 






Serine/threonine specific protein 
phosphatases proteins. 


BL00125B 21.48 1.000e-40 89- 
135 BL00125C 19.97 1.000e-40 
153-200 BL00125D 33.11 1.000e- 
40 213-268 BL00125A 14.83 
8.941e-38 47-84 


659 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 492-505 
PD00066 13.92 9.308e-15 380-393 
PD00066 13.92 6.000e-13 352-365 
PD00066 13.92 7.000e-13 240-253 
PD00066 13.92 7.500e-13 268-281 
PD00066 13.92 7.500e-13 408-421 
PD00066 13.92 2.174e-11 464-477 
PD00066 13.92 1.000e-10 436-449 


660 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.189e-26 29-68 


661 


BL00795 


Involucrin proteins. 


BL00795C 17.06 7.882e-15 193- 
238 BL00795C 17.06 3.797e-13 
187-232 BL00795C 17.06 5.014e- 
13 188-233 BL00795C 17.06 
4.506e-12 196-241 BL00795C 
17.06 7.896e-12 191-236 
BL00795C 17.06 1.667e-11 185- 
230 BL00795C 17.06 2.000e-11 
198-243 BL00795C 17.06 3.778e- 
11 171-216 BL00795C 17.06 
6.111e-11 197-242 BL00795C 
17.06 6.444e-11 194-239 
BL00795C 17.06 8.000e-11 189- 
234 BL00795C 17.06 8.556e-11 
192-237 BL00795C 17.06 1.733e- 
10 195-240 BL00795C 17.06 
2.779e-10 184-229 BL00795C 
17.06 4.035e-10 199-244 
BL00795C 17.06 5.081e-10 186- 
231 BL00795C 17.06 6.965e-10 
190-235 BL00795C 17.06 2.700e- 
09 200-245 BL00795C 17.06 
5.800e-09 175-220 BL00795C 
17.06 6.500e-09 182-227 
BL00795C 17.06 6.600e-09 201- 
246 BL00795C 17.06 6.600e-09 
202-247 BL00795C 17.06 6.600e- 
09 208-253 






Nucleoside diphosphate kinases proteins. 


BL00469 22.22 1.000e-40 149-204 


663 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.411e-11 331- 
355 


664 


BL00601 


family) proteins. 


BL00601B 20.92 3.631e-13 69-98 


665 


BL00082 


Extradiol ring-cleavage dioxygenases 
proteins. 


BL00082A 19.07 8.615e-12 49-72 


666 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 


DM01537B 21.63 4.073e-37 834- 






HELICASE. 


881 DM01537B 21.63 9.750e-21 
1669-1716 DM01537A 15.14 
8.650e-18 698-718 DM01537A 
15.14 6.766e-12 1537-1557 


667 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 7.923e-38 820- 
867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14 
8.650e-18 684-704 DM01537A 
15.14 6.766e-12 1523-1543 


ovy 


RT nmn7 


Protein kinases ATP-binding region 
proteins. 


rJJL/UUlU/A 16. J? 0./ooe-z4 o4y- 

880 BL00107B 13.31 6.727e-13 
916-932 






Ubiquitin domain proteins. 


"RT nmoo 9a<q tj^** oo 
uuuuzyy z6.o4 y. /jDe-z / o /-ay 


671 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 6.571e-12 432-475 


676 


PR00861 


ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 


PR00861E 9.88 2.385e-09 206- 
221 


678 


BL00225 


Crystallins beta and gamma 'Greek key' 
motif proteins. 


BL00225B 18.06 7.517e-24 1805- 
1840 BL00225B 18.06 8.297e-20 
1987-2022 BL00225B 18.06 
2.575e-19 1896-1931 BL00225B 
18.06 8.200e-19 175-210 
BL00225B 18.06 8.200e-19 1698- 
1733 BL00225B 18.06 4.808e-14 
73-108 BL00225B 18.06 4.808e- 
14 1596-1631 BL00225B 18.06 
5.500e-14 2077-2112 BL00225A 
13.82 5.829e-12 2043-2064 
BL00225A 13.82 3.127e-09 1759- 
1780 


0 /y 




G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10 
169-184 


680 


BL00243 


Integrins beta chain cysteine-rich domain 
proteins. 


BL00243 1 3 1 .77 1 . 1 43e- 1 r 1 72- 
215 


681 


PR00852 


XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 


PR00852H 5.90 1.000e-29 612- 
635 PR00852E 8.14 3.769e-27 
348-371 PR00852D 11.38 8.875e- 
27 309-331 PR00852B 11.08 
2.S00e-25 249-269 PR00852I 
17.26 3.500e-25 683-704 

PR00852F 11.65 5.909e-24 379- 
398 PR00852G 16.19 4.462e-23 
468-486 PR00852C 8.81 9.143e- 
23 284-303 


682 


BL50058 


G-protein gamma subunit profile. 


BL50058 27.23 1.375e-35 15-63 




BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 7.500e-20 40-58 
BL00972D 22.55 3.903e-16 300- 
325 BL00972B 9.45 1.000e-13 
120-130 BL00972E 20.72 5.500e- 
11 325-347 


687 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 4.273e-14 98- 
138 


056 


BL00388 


Proteasome A-type subunits proteins. 


BL00388A 23.14 1.000e-40 8-54 

DL\}\jiooD J 1 ..So 3.oC)4e-j J 00- . 

108 BL00388D 20.71 1.000e-21 
153-184 BL00388C 18.79 8.147e- 
16 126-148 


689 


PD02796 


PROTEIN STEROL CARRIER LIPID- 


PD02796B 20.92 1.105e-15 347- 






TRAN. 


394 


691 


PD01572 


PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 


PD01572 8.77 4.083e-09 1-31 


692 


BL00028 


Zinc finger, C2H2 type, domain proteins. 


BL00028 16.07 7.600e-10 488-505 


694 


BL01013 


Oxysterol-binding protein family 
proteins. 


563 BL01013D 26.81 8.235e-23 
814-858 BL01013C 9.97 6.211e- 
14 615-625 BL01013B 11.33 


695 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 3.571e-13 164-178 
PD00289 9.97 8.650e-11 2147- 

37 


698 


PR00161 


NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e-09 282- 
302 


700 


PR00749 


LYSOZYME G SIGNATURE 


PR00749F 13.63 8.636e-13 139- 

173-194 PR00749B 16.54 1.419e- 
11 48-70 PR00749C 7.26 3.060e- 
11 72-91 PR00749A 10.33 
4.815e-10 24-45 


703 


PR00704 


CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 


PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132- 
158 PR00704E 12.55 5.500e-27 
162-186 PR00704F 13.61 1.000e- 
22 187-215 PR00704G 13.87 
1.237e-21 317-339 PR00704H 

PR00704A 14.68 2.125e-19 27-51 
rssssM 11. oo 1 .<i J / e- 1 / yo- 
113 PR00704B 17.94 1.833e-15 
72-95 


705 


PR00859 


PROKARYOTE METALLOTHIONEIN 
SIGNATURE 


PR00859C 7.06 2.776e-09 94-111 


706 




Intermediate filaments proteins. 


416 BL00226B 23.86 3.250e-24 
21 268-299 BL00226A 12.77 


707 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.440e-10 2-15 


708 


BL00361 


Ribosomal protein S10 proteins. 


BL00361B 18.34 5.101e-10 82- 
105 


709 


PR00021 


SMALL PROLINE-RICH PROTEIN 
SIGNATURE 


PR00021A 4.31 2.200e-10 2-15 


710 


BL00514 


Fibrinogen beta and gamma chains C- 
terminal domain proteins. 


BL00514C 17.41 8.412e-27 160- 

107 RT nn^ljlT? 14 75! 9 OnOo 1 A 

219-236 BL00514H 14.95 1.551e- 

15'^17-'?47 RT fin';i4n 1 "> 08 

7.750e-15 284-314 BL00514D 
15.35 4.789e-10 201-214 


711 


PD00930 


PROTEIN GTPASE DOMAIN 


PD00930B 33.72 8.714e-12 49-90 


714 


BL00400 


LBP / BPI / CETP family proteins. 


BL00400C 24.53 6.029e-17 158- 
202 BL00400D 23.26 2.080e-14 
222-259 BL00400A 21.59 1.600e- 
10 27-59 


715 


BL01154 


RNA polymerases L / 13 to 16 Kd 


BL01154B 24.55 5.500e-36 40-76 






subunits proteins. 


BL01154A 18.70 3.000e-22 19-40 


/ 10 




FINGER METAL-BINDING NU. 


PD01066 19.43 9.786e-32 10-49 


717 




Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200 


719 


BL00309 


Vertebrate galactoside-binding lectin 
proteins. 


BL00309C 18.65 2.241e-09 62-87 


726 


BL00687 


Aldehyde dehydrogenases glutamic acid 
proteins. 


BL00687E 25.37 7.136e-33 266- 
316 BL00687D 26.00 5.333e-28 
151-198 BL00687B 17.54 3.647e- 
26 39-81 BL00687C 24.13 
0.087e-22 96-133 BL00687F 9.55 
2.500e-11 352-363 


111 
IJ. / 




KW xKAlNoC-Kir lAaJfc xUlVJlKaJb 11 

ORF2. 


DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15 
180-226 


734 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 


PD00301A 10.24 6.400e-09 101- 
112 


735 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024A 10.26 1.000e-40 22-69 
BL01024B 8.91 1.000e-40 86-127 
BL01024C 7.80 1.000e-40 146- 
185 BL01024D 13.22 1.000e-40 
185-222 BL01024E 11.96 1.000e- 
40 222-266 BL01024F 9.42 
1.000e-40 266-317 BL01024G 
11.09 1.000e-40 317-349 
BL01024H 13.88 1.000e-40 389- 
442 


736 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913D 11.90 7.130e-10 24-51 


737 


PR00700 


PROTEIN TYROSINE PHOSPHATASE 
SIGNATURE 


PR00700D 12.47 2.200e-09 82- 
101 




PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320C 13.01 1.600e-09 68-83 
PR00320A 16.74 7.366e-09 68-83 




PR00871 


DNA 
NUCLEOTIDYLEXOTRANSFERASE 
(TDT) SIGNATURE 


PR00871G 14.48 8.000e-09 178- 
201 


745 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 2.286e-10 33-42 


749 


BL00215 



Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.200e-15 221- 
246 BL00215A 15.82 7.618e-14 
20-45 BL00215A 15.82 8.851e-11 
123-148 BL00215B 10.44 9.526e- 
11 69-82 BL00215B 10.44 
7.300e-09 272-285 BL00215B 
10.44 8.500e-09 165-178 


751 


BL50002 


Src homology 3 (SH3) domain proteins 
profile. 


BL50002A 14.19 1.000e-14 370- 
389 BL50002B 15.18 2.200e-10 
406-422 






HMG1/2 proteins. 


BL00353B 11.47 3.089e-12 390- 
440 


753 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 4.214e-14 47-69 


754 




ABC transporters family proteins. 


TIT AAO 11A nTOC 0/1 1a 1 a jC/C TO 

l3L,UUZl lA Iz.zJ 6.y416-lU OO-Zo 


755 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 7.750e-19 392- 
415 PR00926C 16.07 5.935e-17 
253-274 PR00926D 10.53 2.059e- 
15 301-320 PR00926E 11.70 








4.971e-15 344-363 PR00926B 
16.07 9.526e-13 210-225 
PR00926A 10.41 1.514e-12 197- 
211 


756 


BL01187 


Calcium-binding EGF-like domain 


BL01187A 9.98 2.125e-12 324- 
■^36 RF 01187A 9 QR 4 7R0P-1 1 

377-389 BL01187B 12.04 3.057e- 
10 439-455 


757 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 4.429e-10 43-56 


758 


PR00055 


HIV TAT DOMAIN SIGNATURE 


PR00055A 8.13 8.855e-09 144- 
156 


759 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.304e-11 110-123 


760 


PR00448 




37-57 PR00448B 16.01 9.379e-21 
100-118 PR00448r 11 46 1 OOOp- 
20 129-147 


765 


BL01042 


Homoserine dehydrogenase proteins. 


BL01042A 13.29 5.909e-11 74-95 


766 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 


768 


BL00762 


WHEP-TRS domain proteins. 


BL00762A 23.43 8.500e-28 112- 

6-43 BL00762C 15.58 4.176e-09 
459-472 BL00762b 11.15 9.667e- 
09 210-220 


769 


PR00709 


AVIDIN SIGNATURE 


PR00709A 4.60 1.934e-09 1-20 


770 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 


PR00320C 13.01 1.720e-10 262- 

262-277 PR00320C 13.01 4.300e- 
09 96-111 PR00320B 12.19 

S "^OOp-OQ 769-977 PROO'^OnA 

16.74 6.268e-09 55-70 


771 


PR00019 


SIGNATURE 


PROOni QR 1 1 'Ifi R 714p.19 R7. 

101 PR00019A 11.19 1.000e-10 
90-104 


772 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 110- 
159 


773 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


PD02807C 8.91 6.308e-10 155- 

204 


774 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 
652-675 DM00547B 11.28 

1 81 Rp-l R '^1 R ^^9 r)Vf0n';4.7P 

17.30 3.531e-17 546-568 
DM00547A 12.38 1.273e-11 497- 

622-636 


776 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 769- 

792 


777 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 
765 


778 


PR00779 


INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 


PR00779F 14.51 5.147e-09 742- 

765 


779 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.543e-09 6-45 


781 


PR00205 


CADHERIN SIGNATURE 


ri\\J\J^\JDD I i.jy J.lloc-Ll 034- 

672 PR00205B 11.39 8.588e-11 
230-248 PR00205B 11.39 8.527e- 
10 551-569 PR00205B 11.39 
4.203e-09 336-354 


783 


BL00625 


Regulator of chromosome condensation 
(RCC1) proteins. 


BL00625B 17.69 2.167e-19 193- 
227 BL00625A 16.21 5.500e-17 
199-228 BL00625B 17.69 1.S856- 

ID lH\r-l /H DLAtVOZjD I i .Oy 

2.770e-16 245-279 BL00625A 

BL00625A 16.21 6.507e-14 146- 
175 


785 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 

PF00f)!!4R 0 4S d An(\p 00 fiSfi fifiS 
f rvuuo*TXD y,Hj DjO-ODo 


786 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 595-607 

pPnnn5M.*R o f\ AC\(\f> no fiiA^^ 
rr\j\j\jo^D y.HD D.*tUwo-uy ODO'DOC 


787 


BL00826 


MARCKS family proteins. 


BL00826C 7.63 6.738e-09 203- 
230 


788 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453B 14.65 8.568e-10 75-90 


789 


PR00102 


ORNITHINE 

CARBAMOYLTRANSFERASE 
SIGNATURE 


rjsXJV i\JZ.D Ih^.oZ J .m oO-\Jy yOj" 

977 


790 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030B 7.03 5.500e-11 199- 
209 


791 


BL00415 




437 BL00415N 4.29 2.117e-09 

09 97-141 BL00415N 4.29 
5.664e-09 387-431 


795 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.091e-36 105-144 


799 


PF00731 


AIR carboxylase. 


PF00731C 23.16 7.333e-35 337- 

■^Rfl PPnn7^ m 1 0 /17 7 il70a 7Q 

299-336 PF00731A 19.32 6.333e- 
24 268-297 


804 


BL00170 


Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 


RT nm 7^15 on 07 Q A71 a AO Om 

337 


805 


BL00678 




BL00678 9.67 5.800e-10 418-429 


806 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 7.571e-14 290- 
318 


807 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 9.100e-09 451- 
466 


809 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-12 564- 


810 


PR00453 


VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 


PR00453A 12.79 1.310e-14 36-54 


814 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.047e-31 16-55 


815 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


pr\m n/C*^ io /t^ o f\Aia. 'x\ \c cc 
X^JJUIUOO VjAi /.04/e-Jl IOtjD 


817 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 


818 


PR00830 


ENDOPEPTIDASE LA (LON) SERINE 


PR00830A 8.41 9.571e-11 115- 






PROTEASE (S16) SIGNATURE 


135 


819 


BL00126 


3'5'-cyclic nucleotide phosphodiesterases 
proteins. 


BL00126C 22.07 7.857e-24 528- 
569 BL00126E 35.22 3.714e-15 
669-724 BL00126D 25.50 1.173e- 
14 584-623 BL00126B 15.20 
1.000e-12 502-514 BL00126A 
27.56 3.361e-09 461-498 


820 


PR00511 


TEKTIN SIGNATURE 


PR00511B 12.25 8.826e-22 174- 
195 PR00511A 13.59 7.723e-ir 
155-172 


821 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 2.800e-15 13-36 


822 


PF00780 


Domain found in NIK1-like kinases, 
mouse citron and yeast ROM. 


PF007801 14.69 4.825e-09 231- 
261 


827 


BL00030 


Eukaryotic RNA-binding region RNP-1 
proteins. 


BL00030A 14.39 5.235e-11 144- 
163 


828 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 9.357e-ll 545- 

586 


829 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 
30 235-261 PD02448F 14.22 
9.654e-25 279-303 PD02448D 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 305- 
318 


830 


BL00720 


Guanine-nucleotide dissociation 
stimulators CDC25 family sign. 


BL00720B 16.57 4.500e-23 483- 
507 


831 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 6.625e-21 143- 
174 BL00107B 13.31 4.214e-10 
213-229 


832 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 5.787e-11 32-57 


833 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


PR00497A 6.92 4.375e-09 41-59 


834 


BL00229 


Tau and MAP proteins tubulin-binding 
domain proteins. 


BL00229A 23.57 9.565e-10 99- 
138 


835 


BL00421 


Transmembrane 4 family proteins. 


BL00421E 20.97 2.216e-09 1053- 
1083 


836 


BL00795 


Involucrin proteins. 


BL00795B 12.41 7.931e-09 405- 
445 


837 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 1.000e-17 34-53 
PR00020B 15.52 5.846e-16 68-85 
PR00020D 12.70 2.543e-15 147- 
162 PR00020C 13.66 3.483e-13 
95-107 PR00020E 8.64 6.586e-13 
165-179 


838 


BL50017 


Death domain proteins profile. 


BL50017B 17.60 6.897e-13 1499- 
1515 


839 


PF00850 


Histone deacetylase family. 


PF00850C 14.55 9.542e-09 1352- 
1369 


840 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.500e-17 44-60 
PF00023B 14.20 7.923e-11 73-83 
PF00023B 14.20 9.000e-10 139- 
149 PF00023B 14.20 5.500e-09 
40-50 


842 


BL01194 


Ribosomal protein L15e proteins. 


BL01194B 13.66 1.000e-40 37-85 
BL01194C 12.35 9.250e-40 103- 
138 BL01194A 18.70 7.632e-38 








2-37 BL01194D 19.02 2.658e-36 
139-178 


843 


BL00610 


Sodium:neurotransmitter symporter 
family proteins. 


BL00610A 17.73 1.000e-40 40-90 
BL00610B 23.65 1.000e-40 104- 
154 BL00610C 12.94 1.000e-40 
206-258 BL00610E 20.34 1.000e- 
40 355-398 BL00610F 29.02 
1.000e-40 454-509 BL00610D 
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 
537 


845 


BL00143 


Insulinase family, zinc-binding region 
proteins. 


BL00143A 20.91 4.300e-20 94- 
121 BL00143C 14.16 5.500e-13 
245-258 BL00143B 14.41 9.053e- 
10 141-156 


846 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


847 


PR00543 


OESTROGEN RECEPTOR 
SIGNATURE 


PR00543D 10.87 1.355e-09 898- 
914 


848 


BL00824 


Elongation factor 1 beta/beta'/delta chain 
proteins. 


BL00824C 14.58 1.000e-40 129- 
167 BL00824D 14.04 6.192e-39 
\in-1^ BL00824B 9.21 2.080e- 
21 96-116 BL00824E 12.49 
3.333e-19 210-226 BL00824A 
13.78 8.650e-14 19-34 


849 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.000e-40 12-51 


850 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.316e-24 10-49 


852 


BL01272 


Glucokinase regulatory protein family 
proteins. 


BL01272B 19.61 6.870e-30 136- 
171 BL01272C 11.68 3.314e-25 
249-274 BL01272A 6.49 1.231e- 
18 99-117 


853 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.341e-20 65- 
106 


854 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 


PD00289 9.97 6.850e-11 140-154 


858 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450C 12.22 3.250e-25 68-90 
PR00450B 11.76 8.125e-23 22-42 
PR00450D 16.58 8.920e-22 92- 
112 PR00450E 12.14 1.581e-19 
114-133 PR00450G 15.33 5.500e- 
19 166-187 PR00450F 12.30 
4.375e-15 140-156 PR00450A 
13.58 1.857e-14 8-23 


860 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 7.188e-27 74-117 


866 


BL00477 


Alpha-2-macroglobulin family thiolester 
region proteins. 


BL00477L 23.51 7.480e-20 54-87 


867 


BL01078 


Molybdenum cofactor biosynthesis 
proteins. 


BL01078B 14.20 1.621e-20 408- 
429 BL01078A 10.16 2.000e-13 
366-379 BL01078D 5.99 3.455e- 
11 566-576 BL01078C 10.52 
3.793e-11 501-513 


868 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 462- 
489 BL01177C 17.39 5.333e-19 
416-435 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 441-459 


869 


BL01177 


Anaphylatoxin domain proteins. 


BL01177E 20.64 5.800e-24 415- 








442 BL01177C 17.39 5.333e-19 
369-388 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 394-412 


871 


BL50007 


Phosphatidylinositol-specific 
phospholipase X-box domain proteins 
prof. 


BL50007A 19.61 1.000e-40 322- 
J06 J3L50UU7U 19.54 l.uOOe-40 
589-631 BL50007B 20.90 6.700e- 

9.053e-33 748-785 BL50007C 


872 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 


BL00972D 22.55 3.250e-17 90- 
115 


874 


PR00452 




PRnfM';9Ti 1 1 fis 4 7';np no ■?7n_ 
386 


377 


BL00741 


Guanine-nucleotide dissociation 

stimulators CDC24 family sign. 


1366 


878 


DMOO? 1 S 






881 


PD02807 


APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 


P'nnoRmP t n on a thoa no "xkq 
407 


882 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.188e-37 8-47 


885 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.071e-09 10-26 


886 


PR00372 


BIOPTERIN-DEPENDENT 
AROMATIC AMINO ACID 
HYDROXYLASE SIGNATURE 


PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24 
134-154 PR00372E 12.62 2.125e- 

Zi JoU-j)bU rKUUi /zO I.W 

3.025e-22 289-309 PR00372F 
13.09 6.333e-21 395-414 

348 


oo / 




GTP-binding elongation factors proteins. 


TiJ f\ri'ir\ \ p OA no o Qnf\a o/i i f\i 
135 BL00301A 12.41 4.316e-13 


888 


BL00518 


Zinc finger, C3HC4 type (RING finger), 


BL00518 12.23 1.667e-09 30-39 


889 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 4.906e-26 6-45 


890 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 7.652e-09 113- 
123 






PTR2 family proton/oligopeptide 
symporters proteins. 


BL0102ZB 22.19 D.016e-14 72- 
118 BL01022E 23.51 1.173e-12 

4/z-jUo dJLUIU/zA 1 I.JO y.liDC- 

12 42-61 BL01022D 9.42 3.455e- 


893 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


894 


PD02407 


3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 


PD02407K 12.59 6.529e-10 360- 
383 


895 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237B 13.50 9.100e-14 116- 
138 PR00237F 13.57 1.360e-13 
312-337 PR00237G 19.63 9.069e- 
13 jDi-iSO rR00237b 13.03 
7.120e-12 243-267 PR00237D 
8.94 4.150e-ll 194-216 

108 


896 


BL00129 


Glycosyl hydrolases family 31 proteins. 


BL00129D 16.76 8.258e-26 634- 
678 BL00129A 26.21 1.720e-25 
384-430 BL00129E 22.60 4.857e- 








23 698-734 BL00129C 15.12 
1.750e-22 596-624 BL00129B 
19.19 5.891e-18 495-522 
BL00129F 26.19 7.545e-15 814- 
852 


897 


BL00598 


Chromo domain proteins. 


BL00598 14.45 1.220e-13 9-31 


898 


BL00518 


Zinc finger, C3HC4 type (RING finger), 
proteins. 


BL00518 12.23 6.000e-09 396-405 


899 


PD01101 


INHIBITOR HEAVY CHAIN 
CHANNEL IN. 


PD01101B 21.53 1.000e-40 274- 
327 PD01101D 24.45 1.000e-40 
457-512 PD01101A 18.25 6.268e- 
23 83-117 PD01101C 12.69 
1.237e-16 366-386 PD01101E 
6.73 7.750e-12 566-576 


900 


PR00600 


PROTEIN PHOSPHATASE PP2A 55KD 
REGULATORY SUBUNIT 
SIGNATURE 


PR00600A 11.61 5.979e-09 31-52 


901 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 8.116e-31 24-63 


903 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 1.509e-11 21-65 


906 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12 
549-582 DM00215 19.43 9.824e- 
11 551-584 DM00215 19.43 
2.929e-10 548-581 DM00215 
19.43 4.054e-10 550-583 
DM00215 19.43 5.339e-10 552- 
585 DM00215 19.43 7.107e-10 
544-577 


907 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 6.276e-12 314- 
332 


908 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1125- 
1156 


909 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 5.950e-17 1118- 
1149 


910 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


911 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.560e-13 150- 
181 


912 


PF00856 


SET domain proteins. 


PF00856A 26.14 4.553e-11 243- 
280 


913 


PF00628 


PHD-finger. 


PF00628 15.84 6.400e-13 197-212 


914 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 435- 
459 PR00962G 15.71 4.086e-26 
593-618 PR00962B 11.98 9.122e- 
26 296-319 PR00962A 13.28 
6.143e-22 15-34 PR00962C 8.00 
4.000e-21 348-369 PR00962F 
12.39 9.769e-21 552-572 
PR00962H 13.32 2.636e-20 623- 
643 PR00962I 11.68 9.786e-20 
692-712 PR00962E 8.81 2.915e- 
18 515-534 


915 


PR00962 


LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 


PR00962D 10.40 1.000e-27 365- 
389 PR00962G 15.71 4.086e-26 
523-548 PR00962A 13.28 6.143e- 
22 15-34 PR00962C 8.00 4.000e- 
21 278-299 PR00962F 12.39 
9.769e-21 482-502 PR00962H 








13.32 2.636e-20 553-573 
PR00962I 11.68 9.786e-20 622- 
642 PR00962E 8.81 2.915e-18 
445-464 




BL00134 


Serine proteases, trypsin family, histidine 
proteins. 


107 






LIM domain proteins. 


RT nflA7RR 1/1 70 S 'XQ'Xp 17 711 
79fi RT 0047RR M 70 fi 719p in 

271-286 . 




PR00040 


SIGNATURE 


988 


922 


BL00150 


Acylphosphatase proteins. 


BL00150 25.33 1.000e-40 37-84 


924 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 8.063e-09 79- 
113 




BL00072 


Acyl-CoA dehydrogenases proteins. 


RT flfifl70T^ A9 *> CST^i Oyl 
I3JL.UUU /ZJJ /e-Z4 ZSU- 

331 BL00072E 24.12 8.200e-24 
368-411 BL00072C 25.30 7.873e- 
20 226-267 BL00072B 9.48 
^ndop-17 iR^.iQ/; 


927 


BL00237 


G-protein coupled receptors proteins. 


BL00237C 13.19 1.692e-13 229- 
256 BL00237A 27.68 6.657e-13 

QA_1 m "RT OOTJTT^ 1 1 O a 

13 290-307 


098 




Globins profile. 


BL01033B 13.81 1.000e-15 93- 
105 


929 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 8.714e-13 203- 
253 




BL00415 


Synapsins proteins. 


397 BL00415N 4.29 2.117e-09 
63-107 BL00415N 4.29 3.628e-09 
57-101 BL00415N 4.29 5.664e-09 
347-391 


933 


PD02448 


TRANSCRIPTION PROTEIN DNA- 
BINDIN. 


PD02448A 9.37 1.000e-40 46-85 
PD02448B 10.17 1.000e-40 85- 
133 PD02448C 13.62 1.000e-40 
152-189 PD02448E 11.33 9.000e- 

9.654e-25 267-291 PD02448D 

PD02448G 10.73 7.857e-16 293- 


934 


DM00191 


w SPAC8A4.04C RESISTANCE 


DM00191D 13.94 9.083e-10 136- 
17'; 


935 


BL01115 


GTP-binding nuclear protein ran proteins. 


BL01115A 10.22 4.696e-10 67- 
111 


936 


BL00019 


Actinin-type actin-binding domain 
proteins. 


BL00019D 15.33 8.138e-14 865- 
895 


937 


PR00762 


CHLORIDE CHANNEL SIGNATURE 


PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21 
268-288 PR00762E 12.07 3.250e- 
20 520-537 PR00762D 11.29 
l.uuue-iy 4/i>-4yi rKUU/ozr 

.1^17 1 il7Qia 10 ^ISi 

PR00762B 12.12 1.818e-18 214- 
234 PR00762G 14.13 3.455e-17 
577-592 


938 


BL00027 


'Homeobox' domain proteins. 


BL00027 26.43 9.500e-25 291-334 


939 


DM01111 


4 kw PHOSPHATASE 


DM01111E 17.28 1.568e-10 248- 






TRANSFORMING 61K PDFl. 


297 DM01111E 17.28 5.168e-10 
659-708 DM01111D 16.76 

10.67 8.674e-09 911-935 


940 


BL00107 


Protein kinases ATP-binding region 

proteins. 


Rr oni fiTR 1 ^ ^ ) 1 ooq 

JjLi\J\jL\JiD IJ.Jl l.l/Uv/C-lH ^yj- 

309 BL00107A 18.39 6.760e-13 
229-260 


942 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.832e-11 543- 
597 


943 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 




945 


BL00989 


proteins. 


RT nnoROR "yd ^i i nnnp_4.n 

117 BL00989A 11.66 l.OOOe-13 
5-19 


946 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


469 


947 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


RT nni78R 7 114 R57p-ft0 711 

724 


948 


PF00628 


PHD-finger. 


PF00628 15.84 8.412e-14 201-216 


951 




Sugar transport proteins. 




952 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 4.300e-H 26-49 
PR00926F 17.75 6.348e-09 134- 
157 


955 








957 


PR00069 


ALDO-KETO REDUCTASE 


PR00069A 16.01 8.826e-24 26-51 

rtS\)\}\}OyXj X i.jj 1.31*1^6-1 / 00- 

105 PR00069C 16.03 8.816e-14 

155-171 


958 


PF00583 


Acetyltransferase (GNAT) family. 


PF00583A 12.53 5.500e-10 631- 

649 


961 


PR00328 


GTP-BINDING SAR1 PROTEIN 
SIGNATURE 


PR00328A 10.62 8.740e-10 7-31 


962 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


RT ftni54A 1 SI 0 418p-in 14R0. 

1499 


963 


BL00354 


HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook). 


RT nni54A 1 SI 0 418f».lft 14R0 

1499 


964 


BL00027 






965 


PF00992 


Troponin. 


PF00992A 16.67 2.421e-09 58.1- 

0 1 o - 


966 


PR00515 


5-HYDROXYTRYPTAMINE IF 
RECEPTOR SIGNATURE 


PR00515D 7.91 5.741e-09 13-33


967 


BL00579 


Ribosomal protein L29 proteins.


194 


970


BL00504 


dehydrogenase FAD-binding site 
proteins. 


BL00504D 10.43 7.261e-21 75-93 


973 


PF00580


UvrD/REP helicase. 


PF00580A 13.37 4.720e-09 249- 
971 


974 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456F 5.86 1.000e-10 242-254


975 




G-protein coupled receptors proteins.


RT Anon A OO fiSI A A'^Qa OO OO 

139 


976 


BL00031 


Nuclear hormones receptors DNA- 
binding region proteins. 


BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-98 94-126


977 


PD00066


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489











PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 


BL00721 


Formate-tetrahydrofolate ligase proteins.


BL00721B 13.21 1.000e-40 346-401
BL00721D 13.90 1.000e-40 538-592
BL00721E 13.46 1.000e-40 597-646
BL00721I 18.79 2.500e-40 814-860
BL00721H 21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-321
BL00721C 16.92 4.000e-30 4^6- jjj
BL00721F IS.yo 6.232e-27 660-702
BL00721G 7.97 3.017e-10 721-734


981 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 2.552e-09 180-201


982 


BL00869 


Renal dipeptidase proteins. 


BL00869C 12.58 3.172e-19 59-95
BL00869E 13.12 9.129e-18 120-157
BL00869J 15.60 6.032e-17 270-310
BL00869H 11.08 1.840e-16 219-242
BL00869G 13.55 2.543e-16 192-214
BL00869F 12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-270
BL00869D 14.02 5.282e-10 95-124
BL00869B 15.55 9.382e-10 31-61


983 


PR00196 


ANNEXIN FAMILY SIGNATURE 


PR00196F 13.89 2.125e-09 92-108 


984 


BL00485 


Adenosine and AMP deaminase proteins. 


BL00485D 30.82 2.427e-10 154-209



* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid 
sequence 
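The footnote above fixes the order of the fields in each RESULTS entry (accession number subtype, raw score, p-value, position). A minimal sketch of how such an entry might be read programmatically follows; the names `SignatureHit` and `parse_hits` and the regular expression are illustrative assumptions, not part of the original disclosure, and the pattern only recognizes cleanly formatted entries.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    accession: str    # accession number subtype, e.g. "BL00031A"
    raw_score: float
    p_value: float
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# Assumed layout: two letters, five digits, optional subtype letter,
# then raw score, p-value in scientific notation, and a start-end range.
_HIT = re.compile(
    r"(?P<acc>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_hits(cell: str) -> List[SignatureHit]:
    """Return every well-formed hit found in one RESULTS cell."""
    return [
        SignatureHit(m["acc"], float(m["score"]), float(m["p"]),
                     int(m["start"]), int(m["end"]))
        for m in _HIT.finditer(cell)
    ]


hits = parse_hits(
    "BL00031A 19.55 7.158e-33 60-93 PD00066 13.92 8.200e-16 196-209"
)
for hit in hits:
    print(hit)
```

Entries that do not match the assumed layout are skipped rather than guessed at, so damaged rows would still need manual review.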



TABLE 4 



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


2 


ig 


Immunoglobulin domain 


3.9e-17 


60.3 


3 


HSP90 


Hsp90 protein 


0 


1548.4 


6 


tsp_1


Thrombospondin type 1 domain 


0.002 


22.1 


7 


7tm_1


7 transmembrane receptor (rhodopsin
family)


6.7e-08 


27.3 


9 


PWWP 


PWWP domain 


8.1e-16


66.0 


12 


C1q


C1q domain


1.7e-26 


101.5 


13 


C1q


C1q domain


2e-20


81.3 


14 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.7e-42 


153.9 


15 


E1-E2_ATPase


E1-E2 ATPase 


6.3e-124 


412.2 


16 


trypsin 


Trypsin 


1.2e-87 


278.6 


17 


ig


Immunoglobulin domain 


7.6e-12 


43.2 


18 


lectin_c


Lectin C-type domain 


0.0003 


21.2 


20 


Alpha_L_fucos 


Alpha-L-fucosidase 


1.2e-217 


736.5 





22


pkinase 


Eukaryotic protein kinase domain


3.3e-87


303.1


23 


pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.6




pkinase 


Eukaryotic protein kinase domain 


2.7e-85 


296.6 


25 


ank 


Ank repeat 


5.5e-14 


59.9 


27 


pkinase 


Eukaryotic protein kinase domain 


1.5e-100 


347.4 


28


spectrin 


Spectrin repeat


4e-57 


203.2 


29 


spectrin 


Spectrin repeat 


4e-57 


203.2 


30 


WD40 


WD domain, G-beta repeat


1.2e-07 


38.8 


33 


rrm


RNA recognition motif.


1.1e-17


72.2 


34 


rrm


RNA recognition motif.


1.1e-17


72.2 


36 


7tm_1


7 transmembrane receptor (rhodopsin
family)


3e-36 


117.3 


37


ank 


Ank repeat 


5.9e-25 


96.3


38 


SRF-TF 


SRF-type transcription factor


1.4e-36 


133.9 


40 


alk_phosphatase 


Alkaline phosphatase 


0 


1034.9 


44 


zf-C2H2


Zinc finger, C2H2 type 


8.6e-103 


354.9 


45 


sugar_tr


Sugar (and other) transporter 


3.1e-08 


40.3 


47 


7tm_2 


7 transmembrane receptor (Secretin 
family) 


6.4e-79 


275.6 


50 


zf-C2H2


Zinc finger, C2H2 type 


1.3e-98 


341.0 


51 


filament 


Intermediate filament proteins 


1.2e-176 


600.3 


52 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


2.7e-10 


37.7 


53 


Cadherin_C_term


Cadherin cytoplasmic region


1.9e-94 


327.2 


54 


S_100 


S-100/ICaBP type calcium binding
domain 


5.2e-18 


73.3 


58 


inositol_P


Inositol monophosphatase family 


5e-13 


49.8 


59 


7tm_1


7 transmembrane receptor (rhodopsin 
family) 


8.8e-46 


147.6 


60 


Kunitz_BPTI 


Kunitz/Bovine pancreatic trypsin 
inhibitor


3.7e-47 


148.6 


62


DAD


DAD family


2.5e-74 


260.3 


63 


MOZ_SAS


MOZ/SAS family 


5.9e-133 


455.1 


64 


MOZ_SAS


MOZ/SAS family 


1.7e-123 


423.6 


65 


ras 


Ras family


9.3e-89 


308.3 


67 


T "I - .1 1*1 

Ham1p_like


Ham1 family


3.7e-49 


176.7 


68 


7tm_1


7 transmembrane receptor (rhodopsin
family)


5.2e-39 


126.1 


70


zf-C2H2


Zinc finger, C2H2 type


1.5e-112 


387.3 


71 


Peptidase_M41 


Peptidase family M41 


1.2e-110 


381.0 


72 


abhydrolase 


alpha/beta hydrolase fold 


9.8e-05 


26.5 


81


K_tetra


K+ channel tetramerisation domain 


0.022 


-16.8 


82 


pkinase 


Eukaryotic protein kinase domain 


5e-49 


176.3 


M 


AAA


ATPases associated with various 
cellular act 


1.3e-77 


271.3 


Q< 
OJ 


homeobox 


Homeobox domain 


1.4e-28


108.3 


O / 


TGF-beta


Transforming growth factor beta like


6.7e-68 


210.2 


y 1 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 




adenylatekinase 


Adenylate kinase 


1.1e-15


60.0 




ig


Immunoglobulin domain 


4.1e-20 


69.8 


OO 

yy 




L-Nri aomam 


3.4e-120 


412.7 


100


homeobox 


Homeobox domain 


7.4e-32 


119.3 


101


zf-C2H2


Zinc finger, C2H2 type


2.2e-47 


170.8 


102 


zf-C2H2


Zinc finger, C2H2 type 


4.4e-csy 




103 


dynamin 


Dynamin family 


1.4e-150 


513.6 


104 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


105 


lectin_c


Lectin C-type domain 


4.2e-15 


63.6 


108 


metalthio 


Metallothionein 


2e-25 


97.9 





112 


HSP20 


Hsp20/alpha crystallin family 


2.6e-20 


77.7 


1 ID 




Elongation factor TS 


3.8e-63 


221.1 


1 

1 10 


sugar_tr


Sugar (and other) transporter 


4e-63 


223.1 


1 1 Q 


catalase


Catalase


0


1158.9




TTP14 


Ubiquitin carboxyl-terminal
hydrolases famil


1e-10


24.4 


199 


metalthio


Metallothionein


z.oe-zD 


y 1 A 


125 


adh_short


short chain dehydrogenase 


1.6e-45 


164.6 


ISO 


KRAB


KRAB box


7.9e-25 


95.9 


iJ.1 


G-alpha


G-protein alpha subunit


1e-249


843.0 




mito_carr


Mitochondrial carrier proteins 


2e-65 


227.2 


131 


EF1BD


EF-1 guanine nucleotide exchange 
domain 


4.9e-53 


189.6 


ioz 


ri\ro 


UYr uomain 


4.9e-28 


106.6 


1 ^"5 


uir 


uir uomaui 


4.9e-28


106.6 


134 


lipocalin 


Lipocalin / cytosolic fatty-acid 
binding pr 


2.1e-33 


119.1 


135 


pkinase 


Eukaryotic protein kinase domain 


3.3e-86


299.8 


136


ank 


Ank repeat 


2.2e-29 


111.1 


137


IL8


Small cytokines 
(intecrine/chemokine), inter 


3.1e-18 


65.2 




pyridoxal_deC


Pyridoxal-dependent decarboxylase 
conse 


0.00011 


19.0 


IhU 


ju — ^ — ' 


Cadherin domain 


1.3e-88


307.8 


142 


efhand 


EF hand 


5.7e-33 


123.0 


143


Acyltransferase 


Acyltransferase 


2e-29


111.2


140 


cytochrome_c


Cytochrome c 


1.7e-33 


124.7 


147


pkinase 


Eukaryotic protein kinase domain 


2.3e-86


300.3 


148 


PDZ 


PDZ domain (Also known as DHR or 
GLGF).


1.7e-09 


45.0 


149


aldo_ket_red


Aldo/keto reductase family 


7.4e-189 


640.8 


150 


homeobox 


Homeobox domain 


3.2e-08 


38.7 


151


PseudoU_synth_1


tRNA pseudouridine synthase 


4.7e-57 


203.0 




abhydrolase


alpha/beta hydrolase fold


1.7e-31


118.0


I 


PDZ


PDZ domain (Also known as DHR or


1.1e-09


45.6 


1 JU 


PHD


PHD-finger


7.6e-15 


62.8 


157 


fn3


Fibronectin type III domain


0.015 


21.9 


IDo 


homeobox


Homeobox domain 


2.7e-27 


104.1 


iou 


"own 
PWI


PWI domain


3.9e-24 


93.6 


162


DnaJ


DnaJ domain 


2e-06 


34.8 


164


Cbl_N 


CBL proto-oncogene N-terminal 
domain 


8e-117 


401.5 


166 


metalthio 


Metallothionein


3.1e-26 


100.6 


167




Leucine Rich Repeat 


0.00069 


26.3 


169 


fibrinogen_C


Fibrinogen beta and gamma chains, 
C-term 


5.3e-180 


611.4 


'1 it\ 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term


5.3e-180 


611.4 




uDrinogen 


Fibrinogen beta and gamma chains. 


1e-149


510.8 


1 ID 


homeobox


Homeobox domain 


l.je-29 


1 1 1.O 


1 ^4 

1 /H 


FYVE


FYVE zinc finger


7.4e-28 


103.8 


175 


GRP 


vjx>jur uV/lllaUl 


•3 Op AR 




182 


pkinase 


Eukaryotic protein kinase domain 


3.4e-71 


250.0 


185 


CAP_GLY


CAP-Gly domain 


5.6e-51 


182.8 


186 


TBC 


TBC domain


2.2e-50 


180.8 


187 


TBC 


TBC domain


2.2e-50 


180.8 





1 Clf 
1 oo 




r uomain ^Atso Known as lyniv or 


fe- 1 J 


-> /.u 


1 RO 

1 07 


Kp\rh 


P^CIUU lUULlI 


S 9p-1 OA 






1 i UpUiiiyUolil 


1 1 UjJUlUjrUoUlo 


•3 8f._m 








RiVcVp r7Fp-7^1 Hnmnin 


V/.UU 1 o 




100 


ig


Immunoglobulin domain




1 

DO. 1 


202 


EGF 


EGF-like domain 


3.4e-54 


193.5 




trefoil 


Trefoil (P-type) domain 


ie--d4 






113 


ijdc Qomain 








cZIlaDu 


r,r nana 


u.uuyo 






loiv ^flannel 


Slow voltage-gated potassium 


U.UU J 1 


O.i 




, .. 

trefoil


Trefoil (P-type) domain


7 0#»_4SI 


1 / J. / 


700 


]?ihn<:nTnal ^1 


Prhncnmnl nrntpin Ql'^/^IR 


1 .Zc- / o 


074 7 


710 


Jl ICiJlUpCAJii 


Uatti rn^ov fn 
ncill UpC AXli 


1 .JC-OZ 


771 ^ 
ZZl.D 


213 


TBC 


TBC domain 


2.5e-48 


174.0 




B35IC 






1 oo c 




INJUOSOUlal LiZh 


rv^W mOIu 


C Oa 


DO O 
07.Z 


222 


fn3


Fibronectin type III domain 


7.3e-141 


481.4 




cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr 


9.3e-47 


168.8 


224 


efhand 


EF hand


6.1e-06 


33.2 




Pterin_4a


Pterin 4 alpha carbinolamine 
dehydratase 


9.3e-42 


152.1 




ABC_tran


ABC transporter


4.1e-110


379.2 


Z34 


tl Uerrz JJerr 
2 


El family 


3.7e-90 


312.9 




t l_Uerr/_Uerr 
2 


fal ramily 


1.6e-48 


174.6 




PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


1.7e-25


98.1 




Opiods_neuropep


Vertebrate endogenous opioids
neurope


1.8e-159


543.2 




eIF-5a


Eukaryotic initiation factor 5A 

ll 1 0 1 o 


j.ye-iiw 


■3 CO O 

JDO.O 


740 




Flavin containing amine oxidase


9 1 1 


■29 C 






7\r\r ■finfTAr ^^OUO H/r*A 

ZrfinL linger, ^ziix type 


9 1 o 00 




244 


Band_7


SPFH domain / Band 7 family 


2.3e-53 


190.7 




ank


Ank repeat 


i.oe-oo 


"300 < 


74A 




Zinc finger, C2H2 type


0. /e-4y 


1 oc n 
175.9 


247 


actin 


Actin 


2.3e-42 


140.3 


Z'to 


ER_lumen_recept


ER lumen protein retaining receptor


2.4e-155


529.5 




PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


Z.Ze-iiS 






collagen


Collagen triple helix repeat (20


1 /la 1 a 


JO.O 


255 






n n^7 


7 R 
/.o 


257 


CAP_GLY


CAP-Gly domain 


1.4e-20 


81.8 


9 fin 

Z.UU 


WD40


WD domain, G-beta repeat 


y.ye-oz 


0 1 o c 


9fi1 


WD40


WD domain, G-beta repeat 


O 0<i /CO 

y.ye-oz 


0 1 o c 






WD domain, G-beta repeat


O Oa jC9 


O 1 0 ^ 

zl O.J 


961 


cofilin_ADF


Cofilin/tropomyosin-type actin-
binding pr


9 Oja 9 1 


oo 


264 


Ribosomal_L14


Ribosomal protein L14p/L23e


o 9o 1 n 
y.ze-iu 


hU.O 


265 




Saposin A-type domain


y1 /l„ 09 

4.4e-z / 




266 


SAPA 


Saposin A-type domain


4.4e-27


103.4 


267 


ABC_tran 


ABC transporter 


9.5e-39 


142.2 


269 


Ribosomal_L14


Ribosomal protein L14p/L23e 


6.2e-62 


219.2 


270 


abhydrolase 


alpha/beta hydrolase fold 


0.042 


-3.3 


272 


ras 


Ras family 


4.3e-87 


302.8 





273 


rrm


RNA recognition motif. 


0.074 


14.6 


Z / J 




Lipocalin / cytosolic fatty-acid
binding pr


z.je-41 


140. 'I- 






TJqc ftiTmiv 
JVito KUillljr 


1 1 f'.fn 
i . I c-o / 


7'^R 


977 


yj I 


T Ir*^rtllli^T> r"OP^\rt\'A/l_4'OT*TViir»ol 

LrDit^uiuxi i*dr DOxy i-icj iiiixioi 


1 .ze-j*f/ 


Q 


278 


START 


START domain


■2 7a_no 


/I /I 1 

•+*T. 1 


279 


WD40 


WD domain, G-beta repeat


1 Rp_77 


1 Od 7 


282 


G-patch




7 Rp-79 


R^ n 


287 




"RTfil familv 




J J 1 .u 


289 


KRAB 


KRAB box 


7 1p-71 


R7 R • 


293 


7tm_3


1 trail cm PTM timnp rpf*pntrtr 




7S^^ f% 


295 


SET 


SET domain




1 n 7 


296 


Pyridox_oxidase 


Pyridoxamine 5'-phosphate oxidase 


1.3e-76 


268.0 


907 


IT J 11 


RNA recognition motif.




1 ^^7 0 


90 R 


Ubie_methyltran


ubiE/COQ5 methyltransferase family


A ^ A 
o.je-uj 


0/; 1 


299 


Ubie_methyltran 


ubiE/COQ5 methyltransferase family


0.0024 


-118.1 




v^yi rcauciaSc 


FAD/NAD-binding Cytochrome 


7 7i* <1 

/./e-01 




302 


G-patch 


G-patch domain


3.1e-14 


60.7 


■^07 
j\J 1 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1 . /e-4j 






x STx. 


r^n Qumain 


A Aft! ^ 


17 9 


^ 1 u 


7tm_1


7 transmembrane receptor (rhodopsin
family)


1 .H-e-o*!- 


77A Q 


31 1 


JXJIUUCUJwdC 


JXJLlUUcUlC^C ili^C UUlIlaill 




77A 7 


119 


tubulin


Tubulin/FtsZ family


4 Op 751/^ 


yOj.O 


314 


SURF4 


SURF4 family


1 7 p 1 QO 


o /o,o 


325 


IMS 


impB/mucB/samB family


7p_SR 


707 ^ 


327 


\/<IUll&l ill 


Cadherin domain


4 ^p-01 
*f 1 


J 1 D.v 


329 


NAC 




7 1P.-751 


1A7 R 


330 


IP_trans


Phosphatidylinositol transfer protein




J JO. J 


332 


TFIIS 


Transcription factor S-II (TFIIS)


O.OC— u-J 


7Q 


337 


zf-C2H2


Zinc finger, C2H2 type






340 


AIRS 


AIR synthase related protein


4p-'^7 


170 7 


343 


annexin


Annexin




970 4 


346 


Stathmin


Stathmin family


1 Rp-on 

1 .oc-yv 




347 




RiVirvcnmal TrrritpiTi T lA 






348 


lactamase_B


Metallo-beta-lactamase superfamily


0.012 


-6.0 




efhand


EF hand


7 ^A 14 


/^l A 


J J J 




Lectin C-type domain


1 ^A 


Q7 1 


354 


WD40 


WD domain, G-beta repeat 


2.2e-18 


74.5 






Lipocalin / cytosolic fatty-acid
binding pr


o.je-iu 


J O.J 


362 


Acetyltransf


Acetyltransferase (GNAT) family


n no 1 0 


94 0 


365 




tRNA synthetases class I (I, L, M and V)


■*f.oc-ioj 


fi9R 9 


366 


Sulfatase


Sulfatase


1P-99R 


770 (\ 


368 


START 


START domain


Rp-1 1 

JJ.OC" 1 1 




369 


pkinase


Eukaryotic protein kinase domain




H L.J 


370 


ACBP 


Acyl CoA binding protein




100 7 


371 




Eukaryotic protein kinase domain




■^97 S 


373 


EGF 


EGF-like domain


9 fip-19 


'id % 


375 




Zinc finger, C2H2 type


C 9p 


99'^ d 


377 


KRAB 


KRAB box 


3.7e-27 


103.7 


379 


SET 


SET domain 


7.3e-61


215.6 


380 


Glyco_transf_8 


Glycosyl transferase family 8 


0.0028 


-40.1 


381 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-06 


33.7 


383


Glyco_transf_8


Glycosyl transferase family 8 


0.0028 


-40.1
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CI?f> TT> 

NO: 






p-value 


PFAM 


384 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


385 


TBC 


TBC domain 


0.017 


-66.6 


389 


Glycos_transf_2


Glycosyl transferases 


1.3e-15 


65.3 


390 


Na_Ca_Ex


Sodium/calcium exchanger protein 


3.9e-105 


362.7 


391 


fn3


Fibronectin type III domain


4 1 p. 1 f19 




392 


&3 


Fibronectin type III domain


4p-4^ 




393 


fn3


Fibronectin type III domain


4p-d^ 


xOj.D 


394 


ldl_recept_b


Low-density lipoprotein receptor
repeat




17^ S 


395 


Ribosomal_L30


Ribosomal protein L30p/L7e 


0.0023 


16.0 


396 


Oxysterol_BP


Oxysterol-binding protein


1.5e-94


327.5 


397 


RDS_ROM1


Peripherin/rom-1 


2.9e-33 


123.9 


399 


lactamase_B


Metallo-beta-lactamase superfamily






402 


F-box 


F-box domain


0 0007 


7R 1 


403 


CLP_protease 


Clp protease


4 Rp.fv4 
t.oc— U*T 


77fi 7 


405 


Ribosomal_L35Ae


Ribosomal protein L35Ae






406 


LIM 


LIM domain containing proteins


0 0009 1 


70 7 


410 


tRNA-synt_1c


tRNA synthetases class I (E and Q)


lp-7'^fi 


700 R 


411 


NTP_transf_2


Nucleotidyltransferase domain


^ Qp-lf^ 


^7 0 
o/.v 


412 


DEAD 


DEAD/DEAH box helicase 


V.WU 11/ 


17 7 


414 


DUF94 


Domain of unknown function DUF94


0 oon 1 1 

U.vV/U i I 


7fi 0 


415 


tubulin 


Tubulin/FtsZ family 




QTi, 7 
y 1 J.I 


420 


SET 


SET domain 

vX>^ A uV/XXluXU 


^p-S7 




421 


WD40 


WD domain, G-beta repeat 


6.1e-29 


109.6 


423 


zf-C2H2 


Zinc finger, C2H2 type


1 ^p ^0 


1 /I /I o 


424 


pkinase


Eukaryotic protein kinase domain


o.ye- /J 


-iol.iS 


428 


LIM 


LIM domain containing proteins


1 Rp 


izo. / 


431


kazal


Kazal-type serine protease inhibitor

domain 




/J.O 


432 


SH2 


Src homology domain 2




1 OR A 


433 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-144 


407 7 


434 


ras 


Ras family 


0.012 


~ X\J\J.O 


436 


E1-E2_ATPase


E1-E2 ATPase 


1.6e-117 


J7 1 ,\J 


437 


RNA_pol_A


RNA polymerase alpha subunit 


0 


1 A77 7 


438 


PHD 


PHD-finger 


1.6e-11


51.7 


439 


lectin_c


Lectin C-type domain




1 U.J 


440 


zf-C2H2


Zinc finger, C2H2 type


1.1e-65


231.6 


441 


arrestin




7 Oo 7^/1 


O^O.i 


442 


aminotran_3


Aminotransferases class-III
pyridoxal-pho


o«6e-oU 


"ill 1 


443 


UCH-1 


Ubiquitin carboxyl-terminal

hydrolases famil 






444 


CTF_NFI


CTF/NF-I family 


2.6e-277 


0'?4 fs 


451 


T-box 


T-box 


3.8e-117 




453 


Rieske 


Rieske [2Fe-2S] domain 


2.6e-13 


^7 7 


454 


zf-C2H2 


Zinc finger, C2H2 type 


3.9e-64 




456 


homeobox 


Homeobox domain 


2 Re-OR 


JO.-' 


459


ig


Immunoglobulin domain 


2.6e-20 


7ft S 

/V.J 


460 


Hydrolase 


haloacid dehalogenase-like hydrolase 


4e-25 


0^ 0 


462 


rve 


Integrase core domain 


1.6e-13 


50.7 


466 


CH 


Calponin homology (CH) domain


7 4p 1 7 


71 1 


467 


CH 


Calponin homology (CH) domain


7 4p 1 7 


71 1 


468 


Sterol_desat


Sterol desaturase 


7.5e-38 


139.2 


469 


pro_isomerase 


Cyclophilin type peptidyl-prolyl cis- 
tr


2.6e-63 


220.9 


470 


Peptidase_M24


metallopeptidase family M24 


6e-08 


28.1 


471 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


5.4e-129 


441.9 





479 

H 1 J, 


myb_DNA-binding


Myb-like DNA-binding domain


J.oe-Uo 




473 


77 




U.UiZ 


9n n 


474 




conserved doma 


D.JC-Oo 


JU-).-) 


475 


Ribosomal_L31e


Ribosomal protein L31e


o. ic-oo 


9'39 ^ 




Pin 


Pin Hr\Tn!J in 




O^^'l 9 
zOj. / 


ill 






1. ic-lZ 


JJ.O 


47R 




iiiu<i/\ / uiiij / .pc)C|j^ lamuy 




-1 /. / 


479 


FYVE 


"P W IP 7mf* fiTlOPr 

X X V X_< ^llil^ liiJgCJ 




/o.O 


480 


DNA_pol_A


DNA polymerase family A




1/^7 4 


482 




short chain dehydrogenase


1 9p-fi9 


991 A 


483 




Ank repeat


1 "^p-l? 


71 0 

/ 1 .y 


484 


IMS 


impB/mucB/samB family




9on *; 


486 


TIR 


TIR domain




O / .o 


487 




Flavin-binding monooxygenase-like


n 

V 


1A9^ ^ 
I^Zj.J 


488 


I_LWEQ


I/LWEQ domain


Q ^p im 




495 


homeobox


Homeobox domain


2.6e-06


30.8 


497 




Eukaryotic protein kinase domain


9 "^p i/=;a 

Z.Jo- 100 


jOO. 1 


4QQ 


MID 


r juruiiCLriui Lypc lu uuiudin 


9 Kt> 9*37 
Z. Jc-Zj / 


oUl.o 


501 


LRR




y. je- J 1 


1 i J.O 


502 


RGS 


Regulator of G protein signaling 
domain 


0.041 


11.9 


503 


filament 


Intermediate filament proteins 


1e-142


487.5 




fn3


Fibronectin type III domain 


1 1 Art 


347.7 






HECT-domain (ubiquitin-
transferase).


la 1 O 


59.0 


'5(17 


rviOObUULal Li IJ\ 

Q 


ivioosuniai proiciii Li iJ\c 


3. /e-zo 


OO 9 


508 


WD40 






ICR 


509 


WD40 


WD domain, G-beta repeat




1 0 R 


510 


WD40 


WD domain, G-beta repeat 


2.1e-42 


154.3 


511 


pkinase


Eukaryotic protein kinase domain


Z. JC-OO 




512 


fr-oarmnn 


OPT Hninairi 


1 Qp-flR 




513 


SH3 


SH3 domain




"34 9 


515 


HTH_AraC


Bacterial regulatory helix-turn-helix
protei


<3p-97 


lUj.O 


516 


zf-C2H2 


Zinc finger, C2H2 type




19R n 


517 


S1


S1 RNA binding domain




9n^ 0 


518 




Eukaryotic protein kinase domain


1 Rp 7S 
1 _.oc- / J 




525 




X-^aULllvl Jxl UUllidUl, 


9p-Rn 


9Rn 

ZoU.O 


528 


zf-C2H2 


Zinc finger, C2H2 type


dp 7n 


9AA 4 


529 


neur_chan


Neurotransmitter-gated ion-channel


< Rp.999 


7^n 8 

/3U.O 


531 


RhoGEF 

XxXl V ViiX-fX 






9 

10U.Z 


532 


myosin_head


Myosin head (motor domain)






533 


LRR 


Leucine Rich Repeat


R '5p-1 S 


A9 A 
OZ.O 


535 


Sec7 


Sec7 domain 




J ly. 1 


536 


homeobox 


Homeobox domain 


4.8e-05 


26.4 


539 


actin


Actin


9 ap 1 nn 


JjU.O 


542 


ank 


Ank repeat 


1.9e-35 


131.2 


544 




Zinc finger C-x8-C-x5-C-x3-H type


9 Co 1 n 


A 1 9 
41./ 






Dual specificity phosphatase,
catalytic doma


2.4e-40


147.4


547 


HMG_CoA_synt


Hydroxymethylglutaryl-coenzyme A
synthas 


0 


1250.8 


549 


laminin_G


Laminin G domain 


3.3e-76 


266.6 


551


PHD 


PHD-finger 


0.008 


9.3 


552 


PDZ 


PDZ domain (Also known as DHR or 


0.0017 


25.0 
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OilfV^ MJJ 

'NO: 






p-value 


rr AM 
SCORE 






GLGF). 






555 


WW 


WW domain 


1.3e-24 


95.3 


558 


kinesin 


Kinesin motor domain


1.8e-176 


599.7 


559


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger)


0.00085


16.5 


563 


efhand 


EF hand 


7.9e-11


49.4 


567 


PH 


PH domain 


7.8e-06 


25.9


568 


PH 


PH domain 


3.1e-39 


143.8 


569 


Hist_deacetyl


Histone deacetylase family


5.2e-106




570 


PDZ 


PDZ domain (Also known as DHR or
GLGF).


3.4e-20


80.5 


571 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1e-16


58.5 


573 


ubiquitin 


Ubiquitin family 


1.4e-08 


31.1 


574 


FH2 


Formin Homology 2 Domain 


1.3e-110 


380.9 


576 


serpin 


Serpins (serine protease inhibitors) 


4.3e-146 


496.4 


579 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-76


265.8 


580 


pkinase 


Eukaryotic protein kinase domain 


6.9e-79 


275.5 


581 


RhoGAP 


RhoGAP domain




1 80 8 


582 


Ribosomal_L7Ae


Ribosomal protein L7Ae




1 0 


584 


kazal


Kazal-type serine protease inhibitor
domain


2.2e-52


187.4


585 


LRR 


Leucine Rich Repeat


4 4p-9}! 


1 Oft 7 


586 


PHD 


PHD-finger


3.8e-12 


53.8 


588 


GTP1_OBG


GTP1/OBG family






590 


collagen


Collagen triple helix repeat (20
copies)




1 'sO 4 


591 


lys


C-type lysozyme/alpha-lactalbumin 
family 


1.6e-3r ■ 


116.4 


596 


ACBP 


Acyl CoA binding protein 


0.0022 


-9.4 


597 


SNF2_N


SNF2 and others N-terminal domain 


3.7e-98 


339.5 


600 


KRAB 


KRAB box 


1.3e-29 


111.8 


606 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


607 


LRR 


Leucine Rich Repeat 


1e-05


32.5 


608 


WD40 


WD domain, G-beta repeat 


5.3e-23 


89.8 


610 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


1.7e-237


802.4 


613 


THF_DHG_CYH


Tetrahydrofolate
dehydrogenase/cyclohydro


4.9e-173 


588.3 


617 




RNA recognition motif 


4e-14 


60.4


618 


rrm


RNA recognition motif. 


4e-14 


60.4 


620 


cofilin_ADF


binding pr 




■^4 9 


621 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


622 


UCH-2 


Ubiquitin carboxyl-terminal 
Hydrolase family 


5.8e-21 


83.1 


625 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-124 


426.4 


628 


DEAD 


DEAD/DEAH box helicase 


2.5e-68 


219.0 


632 


GST 


Glutathione S-transferases. 


4.8e-26 


89.0 


633 


5_nucleotidase


5'-nucleotidase 


6.6e-248 


837.0 


636 


LIM 


LIM domain containing proteins 


1.6e-88 


307.5 


637 


pkinase 


Eukaryotic protein kinase domain 


1.5e-73 


257.8 


638 


MSP_domain


MSP (Major sperm protein) domain 


8.4e-09 


42.7 


639 


metalthio 


Metallothionein 


2e-24 


94.6 


641 


zf-C2H2 


Zinc finger, C2H2 type 


6.1e-114 


391.9 


642 


Ribosomal_S28e


Ribosomal protein S28e 


9.3e-48 


172.1 


643 


Ribosomal_S5


Ribosomal protein S5 


8.3e-87 


301.8 


646 


PHD 


PHD-finger 


0.00025 


23.1 


647 


WD40 


WD domain, G-beta repeat 


1.5e-22 


88.4 
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'NO: 


rr AIM INAMt 


I)J£oCKJUr 1 lOiN 


p-value 


PFAM 


648 


Lipase_GDSL


Lipase/Acylhydrolase with GDSL-
like motif 




9 9 


652 


zf-C2H2 


Zinc finger, C2H2 type 


4.1e-146 


498.8 


653 


histone


Core histone H2A/H2B/H3/H4


1.2e-10 


48.8 


654 


zf-C2H2


Zinc finger, C2H2 type


1 Qe-87 




655 


ras 




u.tc— / / 




657 


zf-C3HC4 


Zinc finger, C3HC4 type (RING




4jfi 4 


658 


STphosphatase 


Ser/Thr protein phosphatase 


2.6e-182 


619.1 


659 


zf-C2H2 


Zinc finger, C2H2 type


1.3e-92


■^91 1 


660 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-85 


297.6 


662 


NDK 


i.^ U^ld.'OlU& LUUliUaUilaLC n 1 1 InirlTj 


1 4p. 1 J 0 


4in 7 


664 


IRF 


Interferon regulatory factor
transcription f




70 


665 


4HPPD_C


4-hydroxyphenylpyruvate
dioxygenase C term 


1.4e-16 


68.5 


666 


DEAD 


DEAD/DEAH box helicase 


4.8e-74 


237.1 


667 


DEAD 


DEAD/DEAH box helicase 


2.9e-70 


225.1 


669 


pkinase 


Eukaryotic protein kinase domain 


6.1e-93 


322.2 


671 


homeobox 


Homeobox domain 


0.018 


16.5 


678 


crystall


Beta/Gamma crystallin 


4.7e-106 


365.8 


679 


WD40 


"WO rffimain fi-hpta rpnpat" 




■?4 0 


680 


Keratin_B2


Keratin, high sulfur B2 protein




ISO 


682 


G-gamma 


GGL domain 


8.5e-33 


117.9 


685 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1 Ati 90 
1 .HC-Zy ■ 


111./ 


686 


Acetyltransf


Acetyltransferase (GNAT) family






687 


7tm_1


7 transmembrane receptor (rhodopsin
family)






688 


proteasome 


Proteasome A-type and B-type


6.5e-64


99S 7 


689 


SCP2 


SCP-2 sterol transfer family 


6.2e-37 


136.1 


690 


TS-N 


TS-N domain 


0.041 


20.1 


692 


zf-C2H2 


Zinc finger, C2H2 type






693 


zf-MYND 


MYND finger




-J.J 


694 


Oxysterol_BP


Oxysterol-binding protein


3.9e-133 


4-S'i 7 


695 


PDZ 


PDZ domain (Also known as DHR or 
GLGF). 


1.3e-30 


115.1 


703 


Peptidase_C2


Calpain family cysteine protease




J7U.U 


706 


filament 


Intermediate filament proteins


7.2e-107 


•JUO.J 


710 


fibrinogen_C


Fibrinogen beta and gamma chains,
C-term 


/ C— OU J 


97R n 


711


SH2


Src homology domain 2 


2.3e-65 


192.1 


712 


ATP-synt_DE


ATP synthase, Delta/Epsilon chain




ion 


713 


ARID 


ARID DNA binding domain


9p-1 7 


71 1 


714 


LBP_BPI_CETP


LBP / BPI / CETP family


8.6e-34 


19'; 7 


715


RNA_pol_L


RNA polymerases L / 13 to 16 kDa
subunit 


4.8e-49 




716 


KRAB 


KRAB box 


1.3e-42 


155.0 


717 


mito_carr


Mitochondrial carrier proteins 


4.8e-38 


133.3 


719 


Gal-bind_lectin


Vertebrate galactoside-binding lectin 


1.5e-25 


90.2 


726 


aldedh 


Aldehyde dehydrogenase family




4in R 


728 


Glycos_transf_2 


Glycosyl transferases 


4e-21 


83.6


734 


ELM2 


ELM2 domain




Ix /.o 


735 


PR55 


submit PR 


A 
\) 


1 v/Jo.Z 


737 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4e-14 


60.4 


740 


WD40 


WD domain, G-beta repeat 


5.6e-14


59.9 


745 


zf-C3HC4


Zinc finger, C3HC4 type (RING


3.8e-13 


46.9 
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NO: 






p-value 


PFAM 






finger) 






749 


mito_carr


Mitochondrial carrier proteins 


4.5e-67 


232.8 


750 


DUF27 


Domain of unknown function DUF27


4.5e-12 


53.5 


751 


SH3 


SH3 domain 


3.6e-17 


70.5 


752 


HMG_box


HMG (high mobility group) box 


8.6e-13 


55.9 


753 


SPRY 


SPRY domain 






754 


GTP_CDC




7 '\p- ] 's'^ 




755 


mito_carr


Mitochondrial carrier proteins


jc~oo 




756 


TSPN 


Thrombospondin N-terminal -like
domains






757 


BTB 


BTB/POZ domain 


5.7e-23 


89 7 


759 


zf-C2H2


Zinc finger, C2H2 type 


1.2e-12 


55.4 


760 


NSF 


NSF attachment protein 


6.4e-127 


435.1 


762 


Ribosomal_S14


Ribosomal protein S14p/S29e 


2.1e-06 


24.8 


765 


ThiF_family


ThiF family




144 


766


DnaJ


DnaJ domain




I J J.J 


768 


tRNA-synt_2b


tRNA synthetase class II


9 Ip.SI 


9S1 7 


769 


ldl recept a


Low-density lipoprotein receptor
domain


0 

V 




770 


WD40 


WD domain, G-beta repeat




R4 


771 


LRR 


Leucine Rich Repeat 


3.8e-06 


33.9 


774 


SNF2 N 


SNF2 and others N-terminal domain


^ oo 
j.je-yy 




776 


VPS9


Vacuolar sorting protein 9 (VPS9) 

domain


1.1e-30


115.4 


777 


VPS9 


Vacuolar sorting protein 9 (VPS9) 

domain


1.1e-30


115.4 


778 


VPS9 


Vacuolar sorting protein 9 (VPS9)

domain 


1.1e-30


115.4


779 




Zinc finger, C3HC4 type (RING


J. le-uo 


"310 


781 


cadherin 


Cadherin domain


J.UC-l u 




783 


HECT 


HECT-domain (ubiquitin-transferase).




1 1 R 
I ID.o 


785 


sushi 


Sushi domain (SCR repeat) 


1.8e-60 


214.3 


786 


sushi 


Sushi domain (SCR repeat)


1 .oc uv 




788 


vwa 


von Willebrand factor type A domain




1 R7 7 


790 


rrm 




9 Rp.9fl 


RO R 


791 


Collagen


Collagen triple helix repeat (20
copies)


n nnn07 

V.UUU7/, 


Q 7 


792 


pkinase


Eukaryotic protein kinase domain




19 4 


795 


zf-C2H2 


Zinc finger, C2H2 type




'^9R 7 


796 


adh short 


short chain dehydrogenase


4 Ip-OS 


-7 


799 


SAICAR synt 


SAICAR synthetase 


6e-125 


d9R S 


805 


WD40 


WD domain, G-beta repeat


4e-65 


999 R 


806 


ZU5 


ZU5 domain


4 7p-'?7 




807 


WD40 


WD domain, G-beta repeat


0.016 


91 R 


808 


WD40 


WD domain, G-beta repeat 


0.0041 


9"^ R 


809 


pkinase 


Eukaryotic protein kinase domain


2e-31 


117 9 

1 X f ^ 


810 


vwa 


von Willebrand factor type A domain 


1.9e-52 


187.7 


814 


zf-C2H2 






9RQ 4. 


815 


zf-C2H2


Zinc finger, C2H2 type 


6e-74 


259.1 


817 


myosin head


Myosin head (motor domain)


i.je-1 /o 




818 


GSPII E 


Bacterial type II secretion system


u.uiz 


1 L.J 


819 


PDEase 


3'5'-cyclic nucleotide 
phosphodiesterase 


1.1e-74


215.5 


821 


PH 


PH domain 


0.00025 


20.5 


822 


CNH 


CNH domain 


0.00015 


-24.7 


827 


rrm 


RNA recognition motif. 


1.5e-06 


35.2


829 


HMG box 


HMG (high mobility group) box


7 Se-34 


125.8 


830 


RasGEF 


RasGEF domain 


2.2e-102 


353.5 


831 


CNH 


CNH domain


3e-118


406.2 


832 


mito carr


Mitochondrial carrier proteins


3.7e-37 


130.3 


833 


PX 


PX domain 


2.7e-19 


77 S 


837 


Y_phosphatase


Protein-tyrosine phosphatase 


1.6e-263 


888.8 


838 


Hnk 


XUIA, 1 VL/WOt 




y 1 l.j 


840 


siilc 








842 


Ribosomal L15e 


Ribosomal L15 


4.8e-131 


448.8 






Sodium:neurotransmitter symporter


u 


IzUi.o 


845 




iiiDuiuio^c ^1 c<|jLiuadC ktiiiiny lyiiKjj 






848 


EF1BD


EF-1 guanine nucleotide exchange
domain




900 7 


849 


zf-C2H2


Zinc finger, C2H2 type


1.5e-122 


420.5 


850 


zf-C2H2 


Zinc finger, C2H2 type


9p-fi7 


9^7 A. 


852 


SIS 


SIS domain




I xD.O 


853 


RhoGAP 


RhoGAP domain


1 1p-'^7 


X Jo.O 


854 


PDZ


GLGF). 




7 


856 


ACOX 






OOO.J 


858 




EF hand


9 Ap 1 9. 




860 


homeobox 


Homeobox domain 


4e-22


86.9 






xroiiaC'ripiioi] miuauon lacior iJx, . 

Hpfa 
ucm 




A <0 0 


866 


A2M 


Alpha-2-macroglobulin family


A Qp 91 


70 O 


867 


MoCF_biosynth 


Molybdenum cofactor biosynthesis 
protein


5.8e-205 


694.3


868 


EGF


EGF-like domain


4.1e-22 


86.9 






EGF-like domain 


■ 1 "X a. OO 

1 . 1 e-zz 


DO O 
OO.O 


871 




Phosphatidylinositol-specific
phospholipase


/ .ze-y J 


JZD.O 


872 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


. 1 . i c~zu 


R9 1 
oZ. X 


874 


SH3 


SH3 domain 


2.2e-14 


61.2 


877 


SH3 


SH3 domain


O.OC 7U 


■^117 
J X X. / 


882 


KRAB 


KRAB box 




xoz.o 


885 


ank 


Ank repeat 


7.1e-07 


36.3 


886


biopterin H


Biopterin-dependent aromatic amino
acid h 


\j 




887 


GTP EFTU


Elongation factor Tu family


A Oa 190 


4 J / .J 


888 


zf-C3HC4 


111 ig&i^ 


i.oe-i*t 


J X 


889 


zf-C2H2


Zinc finger, C2H2 type


3.7e-92 


J X 7.0 


890 


ig 


Immunoglobulin domain 


3.8e-06 


24.8 


892 


PTR2 


POT family






893 


Sulfatase




3.5e-78


273.2


894 


Sulfatase 


Sulfatase 


3.5e-78 


273.2 


895 


7tm 1


7 transmembrane receptor (rhodopsin
family)


4.De-Di 


104.4 


896 


Glyco hydro 31


Glycosyl hydrolases family 31


U 


Iz / /.o 


897 


chromo


'chromo' (CHRromatin Organization
MOdifier)


o Oo 


ZO.U 


898 


Cbl N 


CBL proto-oncogene N-terminal
domain 


1 9o 


OOO A 

yzz.4 


899 


vwa 


von Willebrand factor type A domain 


5.5e-32 


119.7 


900 


WD40 


WD domain, G-beta repeat 


2.7e-07 


37.7 


901 


zf-C2H2 


Zinc finger, C2H2 type


4e-156


532.1 


903 


ras 


Ras family 


6.6e-101 


348.6


904 


Armadillo seg 


Armadillo/beta-catenin-like repeats


1.1e-06


35.6 


906 


FH2 


Formin Homology 2 Domain 


4.5e-112


385.7 


907 


Cytidylyltransf 


Cytidylyltransferase


1.4e-05


29.3 


908 


pkinase 


Eukaryotic protein kinase domain 


1 ^e-64 


228.2 


909 


pkinase 


Eukaryotic protein kinase domain 


8.5e-70 


245.3


910 


pkinase 


Eukaryotic protein kinase domain 


2.9e-42 


153.8 


911 


pkinase 


Eukaryotic protein kinase domain 


1.2e-35 


131.8 


912 


PHD 


PHD-finger 


5.1e-06 


33.4 


913 


PHD 


PHD-finger 


5.5e-16 


66.5 


916 


filament 


Intermediate filament proteins 


9.7e-121 


414 S 


917 


LIM


LIM domain containing proteins 


5.9e-15 


S7 Q 


918 


SAM 


SAM domain (Sterile alpha motif) 


4.3e-16 


66.9 


922 


Acylphosphatase 


Acylphosphatase


2.9e-63 


991 6 


924 




Immunoglobulin domain


1 ^p-flR 

i ..PC vo 


19 J! 


925 


Acyl-CoA dh 


Acyl-CoA dehydrogenase






927 


7tm_1


7 transmembrane receptor (rhodopsin


2.9e-45 


145.9 


928 


globin 


Globin 




1 Hfi 0 


929 


sugar tr 






Oo.o 


932 


Collagen


Collagen triple helix repeat (20

copies) 




0 7 


933 


HMG box 


HMG (high mobility group) box


7.8e-34 




934 


SEA 


SEA domain 


0.0021 


94 7 


935 


ras 


Ras family 




900 9 


936 


CH 


Calponin homology (CH) domain


3.8e-21 


81 7 


937 


voltage CLC 


Voltage gated chloride channels 


1.9e-199 


676.0 


938 


homeobox 


Homeobox domain 


1.9e-25


98.0


940 


pkinase 


Eukaryotic protein kinase domain 


9.9e-58 


205.2 


942 


Myosin tail 


Myosin tail 


3.7e-09 


18 9 


943 


zf-C2H2


Zinc finger, C2H2 type 


2.2e-92 


190 1 


945 


Clat adaptor s 


Clathrin adaptor complex small chain 


1.3e-76 


968 0 


946 


sugar tr


Sugar (and other) transporter 


0.017 


-199 8 


947 


tRNA-synt_1e


tRNA synthetases class I (C) 


0.00097 


15.6 


948 


PHD 


PHD-finger


1.2e-17


71 9 


951 


sugar tr 


Sugar (and other) transporter


0 00X7 




952 


mito carr


Mitochondrial carrier proteins


1 7p-S4 


1 .80 7 


953 


myb DNA- 
binding 


Myb-like DNA-binding domain


4 ^^-70 


RO 1 


955 


ketoacyl-synt 


Beta-ketoacyl synthase


7 le-l'^l 
/ . 1 c~ 1 J J 


4'';4 s 


957 


aldo ket red 


Aldo/keto reductase family 




140 8 


959 


Kelch 


Kelch motif 


0.02 


20.8 


961 


ras


Ras family




1111 


964 


homeobox 


Homeobox domain




OO. J 


965 


PH 


PH domain


1p-91 


RO 0 


966 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


9 9p-00 


14 7 


967 


Ribosomal L29 


Ribosomal L29 protein 


1.6e-15 




970 


FAD_binding_2


FAD binding domain 


8.9e-47 


166.6 


971 


rve 


Integrase core domain


0 000 T) 


10 s 


972 


Glycos transf 2 


Glycosyl transferases


X 


R4 S 


974 


Ribosomal L10


Ribosomal protein L10




1 71 


975 


7tm 1 


7 transmembrane receptor (rhodopsin
family)


1 fyR-"^! 

I .oco / 


191 1 


976 


zf-C4 


Zinc finger, C4 type (two domains)


2.1e-52 


178.5 


977 


zf-C2H2 


Zinc finger, C2H2 type


6.6e-150 


511.4 


978 


FTHFS 


Formate--tetrahydrofolate ligase


0 


1367.2 


982 


Renal_dipeptase


Renal dipeptidase 


1.3e-73


258.0 


984 


A deaminase 


Adenosine/AMP deaminase 


2.6e-05 


-48.6

TABLE 5



SEQ ID NO: 
of full-length
nucleotide 
sequence 


SEQ ID 
NO: of 
full-length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID NO: 
of contig 
peptide 
sequence 


Priority docket
number_corresponding
SEQ ID NO: in
priority application


SEQ ID NO: in 
U.S.S.N. 09/496^14 


1


985


1969


2953


787CIP2 1


150


2


986


1970


2954


787CIP2 2


ZJL2


3


987


1971


2955


787CIP2 3


\ 654


4


988


1972


2956


787CIP2 4


ZiZi


5


989


1973


2957


787CIP2 5


6


990


1974


2958


787CIP2 6


7


991


1975


2959


787CIP2 7


8


992


1976


2960


787CIP2 8


9


993


1977


2961


787CIP2 9


/CO 1 1\


10


994


1978


2962


787CIP2 10


6213


11


995


1979


2963


787CIP2 11


6257


12


996


1980


2964


787CIP2 12


6294


13


997


1981


2965


787CIP2 13


6294


14


998


1982


2966


787CIP2 14


6330


15


999


1983


2967


787CIP2 15


6364


16


1000


1984


2968


787CIP2 16


6455


17


1001


1985


2969


787CIP2 17


6486


18


1002


1986


2970


787CIP2 18


6503


19


1003


1987


2971


787CIP2 19


6528


20


1004


1988


2972


787CIP2 20


6572


21


1005


1989


2973


787CIP2 21


6578


22


1006


1990


2974


787CIP2 22


6593


23


1007


1991


2975


787CIP2 23


6603


24


1008


1992


2976


787CIP2 24


6603


25


1009


1993


2977


787CIP2 25


667y


26


1010


1994


2978


787CIP2 26


6744


27


1011


1995


2979


787CIP2 27


6762


28


1012


1996


2980


787CIP2 28


6770


29


1013


1997


2981


787CIP2 29


6770


30


1014


1998


2982


787CIP2 30


6787


31


1015


1999


2983


787CIP2 31


6858


32


1016


2000


2984


787CIP2 32


6866


33


1017


2001


2985


787CIP2 33


6938


34


1018


2002


2986


787CIP2 34


6938


35


1019


2003


2987


787CIP2 35


6977


36


1020


2004


2988


787CIP2 36


7001


37


1021


2005


2989


787CIP2 37


38


1022


2006


2990


787CIP2 38


7004


39


1023


2007


2991


787CIP2 39


^A^^


40


1024


2008


2992


787CIP2 40


/OUO


41


1025


2009


2993


787CIP2 41


7008


42


1026


2010


2994


787CIP2 42


7014


43


1027


2011


2995


787CIP2 43


7021


44


1028


2012


2996


787CIP2 44


7022


45


1029


2013


2997


787CIP2 4o


7057


46 


1030 


2014 


2998 


787CIP2 47




47 


1031 


2015 


2999 


787CIP2 49 


7088 


48 


1032 


2016 


3000 


787CIP2 50 


7089


49 


1033 


2017 


3001 


787CIP2 51 


7182 


50 


1034 


2018


3002 


787CIP2 52 


7489 


51 


1035 


2019 


3003 


787CIP2 53 


7564 


52 


1036 


2020 


3004 


787CIP2 54 


7566 


53 


1037 


2021 


3005 


787CIP2 55


7587


54


1038


2022


3006


787CIP2 56


7S01


55


1039


2023


3007


787CIP2 57


7600


56


1040


2024


3008


787CIP2 58


7604


57


1041


2025


3009


787CIP2 59


7612


58


1042


2026


3010


787CIP2 60


/0 1 J


59


1043


2027


3011


787CIP2 61


7615


60


1044


2028


3012


787CIP2 62


7616


61


1045


2029


3013


787CIP2 63


7617


62


1046


2030


3014


787CIP2 64


/OZJ


63


1047


2031


3015


787CIP2 65


/OZj


64


1048


2032


3016


787CIP2 66


7*^7 S


65


1049


2033


3017


787CIP2 67


66


1050


2034


3018


787CIP2 68


/Ojo


67


1051


2035


3019


787CIP2 69


7640


68


1052


2036


3020


787CIP2 70


7670


69


1053


2037


3021


787CIP2 71


/O /D


70


1054


2038


3022


787CIP2 72


/Ooo


71


1055


2039


3023


787CIP2 73


7690


72


1056


2040


3024


787CIP2 74


7700


73


1057


2041


3025


787CIP2 75


7774


74


1058


2042


3026


787CIP2 76


7784


75


1059


2043


3027


787CIP2 77


77D^


76


1060


2044


3028


787CIP2 78


7792


77


1061


2045


3029


787CIP2 79


77QQ


78


1062


2046


3030


787CIP2 80


70A7


79


1063


2047


3031


787CIP2 81


7810


80


1064


2048


3032


787CIP2 82


70 lO


81


1065


2049


3033


787CIP2 83


/olO


82


1066


2050


3034


787CIP2 84


/ozo


83


1067


2051


3035


787CIP2 85


7842


84


1068


2052


3036


787CIP2 86


/oDU


85


1069


2053


3037


787CIP2 87


/oOl>


86


1070


2054


787CIP2C 116


JtO J 


883 


1867


2851 


3835 


787CIP2C 117


5409


884 


1868 


2852 


3836 


787CIP2C 118 


5499 


885 


1869 


2853 


3837 


787CIP2C 119


J j.^ J 


886 


1870 


2854 


3838 


787CIP2C 120 


5538 


887 


1871 


2855 


3839 


787CIP2C 121


JJ jj? 


888 


1872 


2856 


3840 


787CIP2C 122


5558 


889 


1873 


2857 


3841 


787CIP2C 123 


5559 


890 


1874 


2858 


3842 


787CIP2C 124 


5586 


891 


1875 


2859 


3843 


787CIP2C 125 


5619 


892 


1876 


2860 


3844 


787CIP2C 126 


5628 


893 


1877 


2861 


3845 


787CIP2C 127 


5640


894


1878 


2862 


3846 




5640 


895 


1879 


2863 


3847 


787CIP2C 129




896 


1880 


2864 


3848 


787CIP2C 130 


6094 


897 


1881 


2865 


3849 


787CIP2C 131


6195 


898 


1882 


2866 


3850 


787CIP2C 132 


6206 


899 


1883 


2867 


3851 


787CIP2C 133 


6355 


900


1884 


2868 


3852 


787CIP2C 134 


6362 


901 


1885 


2869 


3853 


787CIP2C 135 


6386 


902 


1886 


2870 


3854 


787CIP2C 136 


6431 


903 


1887 


2871 


3855 


787CIP2C 137 


6457 


904 


1888


2872 


3856


787CIP2C 138


6480 


905 


1889 


2873 


3857 


787CIP2C 139 


6497 


906 


1890 


2874 


3858 


787CIP2C 140


6532 


907 


1891 


2875 


3859 


787CIP2C 141


6598 


908 


1892 


2876


3860 


787CIP2C 142 


6644 


909 


1893 


2877 


3861 


787CIP2C 143 


6644 


910 


1894 


2878 


3862 


787CIP2C 144


6645 


911 


1895 


2879 


3863 


787CIP2C 145




912 


1896 


2880 


3864 


787CIP2C 146


6761 


913 


1897 


2881


3865 


787CIP2C 147


6782 


914 


1898 


2882 


3866 


787CIP2C 148 


6981 


915 


1899 


2883 


3867 


787CIP2C 149


6981 


916 


1900 


2884 


3868 


787CIP2C 150


7000 


917 


1901 


2885 


3869 


787CIP2C 151 


7029 


918 


1902 


2886 


3870 


787CIP2C 152


7SSS 


919 


1903 


2887 


3871 




8143


920


1904 


2888 


3872 


787CIP2C 154


8143


921 


1905 


2889 


3873 


787CIP2C 155


Jt 


922 


1906 


2890 


3874 






923 


1907 


2891


3875 




8467 


924 


1908 


2892


3876 


787CIP2C 158


8540 


925 


1909 


2893 


3877 


787CIP2C 159




926 


1910 


2894 


3878 


787CIP2C 160




927 


1911 


2895 


3879 


787CIP2C 161


9669 


928 


1912 


2896 


3880 


787CIP2C 162


9695 


929 


1913 


2897 


3881 


787CIP2C 163 


9744 


930 


1914 


2898 


3882 


787CIP2C 164 


9849 


931 


1915 


2899 


3883 


787CIP2D 1 


4180 


932 


1916 


2900 


3884 


787CIP2D 2 


4181 


933 


1917 


2901 


3885 


787CIP2D 3




934 


1918 


2902 


3886 


787CIP2D 4


4snn 


935 


1919 


2903 


3887 


787CIP2D 5




936 


1920 


2904 


3888 


787CIP2D 6


S^Q1 


937 


1921 


2905 


3889 


787CIP2D 7




938 


1922 


2906 


3890 


787CIP2D 8 


5882 


939 


1923 


2907 


3891 


787CIP2D 9


6209


940 


1924 


2908 


3892 


787CIP2D 10


671 Q 


941 


1925 


2909 


3893 


787CIP2D 11 


8130 


942 


1926 


2910 


3894 


787CIP2D 12 


8863 


943 


1927 


2911 


3895 


787CIP2D 13 


8902 


944 


1928 


2912 


3896 


787CIP2D 14


9162 


945 


1929 


2913 


3897 


787CIP2D 15


9197 


946 


1930 


2914 


3898 


787CIP2D 16


9215 


947 


1931 


2915 


3899 


787CIP2D 17


0239 


948 


1932 


2916 


3900 


787CIP2D 18


Q?fi7 


949 


1933 


2917 


3901 


787CIP2D 19


9369 


950 


1934 


2918 


3902 


787CIP2D 20 


9371 


951 


1935 


2919 


3903 


787CIP2D 21 


9516 


952 


1936 


2920 


3904 


787CIP2D 22 


9601 


953 


1937 


2921 


3905 


787CIP2D 23 


9731


954


1938 


2922 


3906 


787CIP2D 24 


9733 


955 


1939 


2923 


3907 


787CIP2D 25 


9769 


956 


1940


2924 


3908 


787CIP2D 26


9804 


957 


1941 


2925 


3909 


787CIP2D 27


9816 


958 


1942 


2926 


3910 


787CIP2D_28 


9844


959 


1943


2927 


3911 


787CIP2D 29


9924 


960 


1944 


2928 


3912 


787CIP2D 30 


9936 


961 


1945 


2929 


3913 


787CIP2D_31 


10163 


962 


1946 


2930 


3914 


787CIP2D_32 


10165 


963 


1947 


2931 


3915 


787CIP2D 33


10165 


964 


1948 


2932 


3916 


787CIP2D 34 


10244 


965 


1949 


2933


3917 


787CIP2D 35


10278 


966 


1950 


2934 


3918 


787CIP2E 1 


4251 


967 


1951 


2935 


3919 


787CIP2E 2 


5310 


968 


1952 


2936 


3920 


787CIP2E 3 


5697 


969 


1953 


2937


3921 


787CIP2E 4 


5731 


970 


1954 


2938 


3922 


787CIP2E 5


5733 


971 


1955 


2939 


3923 


787CIP2E 6


5734 


972 


1956 


2940 


3924 


787CIP2E 7 


5740 


973 


1957 


2941 


3925 


787CIP2E 8 


7657 


974 


1958 


2942 


3926 


787CIP2E 9 


9572 


975 


1959 


2943 


3927 


787CIP2F 1 


1363 


976 


1960 


2944 


3928 


787CIP2F 2


4303 


977 


1961 


2945 


3929 


787CIP2F 3 


5760 


978 


1962


2946 


3930


787CIP2F 4 


5766 


979 


1963 


2947 


3931 


787CIP2F 5 


5767 


980 


1964 


2948 


3932 


787CIP2F 6 


5767 


981 


1965 


2949 


3933 


787CIP2F 7


5770 


982 


1966


2950 


3934 


787CIP2F 8 


6855 


983 


1967 


2951 


3935


787CIP2F 9 


10026 


984 


1968 


2952 


3936 


787CIP2F 10 


10227 



TABLE 6 



SEQ ID
NO: 


Method 


Predicted 
beginning 
nucleotide 
location
corresponding 
to first amino 
acid residue of 
peptide 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


2953


A 


3


324


ISEHRIEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CGWTILLTLTMVQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY 
CPE) 


2954


A 


18


467


REELGKDLFDCTLYVLLKYDDFNADKHLALEEF 

YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 

GTLia>PIIWKRNNIILNNLDLEDINDFGDDGSLYIT 

KVTTTHVGNYTCYADGYEQVYQTHIFQVNVPPV 

IRVYPESQARRAG 


2955 


A 


3 


23 


FYSAFLVADKGIVTSKHNNDTQfflWESDSNEFSV 
IADPRGNTLGRGTnT*VSIPPSL 


2956 


A 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWnCFCVWMAAILLSIPQL 

VFYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FVVPFLIMGVCYFITARTLMKMPNIKIS 


2957 


A 


703 


302 


EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLMLKNFFLSPLDTEIKNKVFKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG










ETRSLPACWAQWKSLALPVSRAPGRQGSLVVFP 
LP 




A 

A 


J ID 


1054 


CTKCKAIJCDTCFNKNFCTKCKSGFYLHLGKCLJ^ 
NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 
KKGKTCGFKRGTETRVREnQHPSAKGNLCPPTN 
ETRKCTVQRKKCQKGERGKKGRERKRKKPNKG 
ESKEAIPDSKSLESSKEIPEQRENKQQQ 




A


i 


4zo 


LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DELTA VWLIFLI\LVLCGFT1.VLLVRIICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 

TM 
UL 


2960 


A


1194 


852 


EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSHLSLGTTLP/V*GHLRFRL 
RMLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 


2961 


A 


274 


2250 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 
SLTPPTSVRRMPLITTVTLLKMVARHHMKLLCSK 
AFSTQLQQKIFLHSQMGIHHQSVCMKLJCPNTSHn 
SILMGQPMALVQLETLAPLTinQKFQTQDHMKF 
WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 
: TIQNGRELFESSLCGDLLNEVQASE\Q*NQSffiSRK 
EKRKKSNKHDSSRSEERKSHKIPKLEPEEQNRPN 
ERVDTVSEKPREEPVLKEGSPSSANTIFCSNNGSV 
HWKFQVGDLVWSKVGTYPWWPCMVSSDPQL 
EVHTKINTRGAREYHVQFFSNQPBRAWVHEKRV 
REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 
PQRERAQWDIGLmAEKALKMTREERIEQYTFIYI 
DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 
QPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEP 
PPVIOAWKTAAARKSLPASITMHKGSLDLQKCN 
MSPVVKIEQWALQNATGDGKFIDQFVYSTKGIG 
NKTEISVRGQDRLHSTPNQKNEKPTQSVSSPEATS 
GSTGSVEKKQQRRSIRTR5ESEKSTE\^KKKIK 
KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 
SSVSAAIEETVD 


2962 



A 


2408 


836 


SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

MPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVLAQNVGTTHDLLDICLICRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFK4KKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLRNLVHSGATKGEISATQDVM 

MEEIFRWCICLGNPPETFTWEYRDKDKNNKKIG 

PMTPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KY>naYTV\EYL\SNMVWRGEKLFYNNQPIDFLK 

KMVAASlKDG\EAywFGCDVGKHFVNSKLG\LSD 

MNLYDHELVFGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKW\RVGEFQWG 

jiJL/non\ivo I i^v-'ivi V i v iijvv/vvv Lyx\jvri 

VP\EEVLAVLGAGNPFVLPAWDPMGALAE 


2963 


A 


90 


543 


RHYDSAGKITLIOAKNYLEQRAVGGASPRLAQS 

VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 

DRMKTTIKETST*LSNSYLVFPLM*SLTYLMKMS










FERCTAimKMFWSPFTKVDNYCT\SS\WKKFYL 
KCYFSLNTIKKEKKMT 


2964 


A 


3 


2454 


fdtyrglpsisngnysqlqfqareysgapysqris 

aittvsvawkvlsgkigegaegnckcvisegaw 

avcptqpcgkakpdkhlkdllskllnsgyfesbp 

vpknakekevpleeemliqsekktqlsktesvke 

seslmefaqpeiqpqeflnrrymtevdysnkqge 

eqpweadyaricpnlpkrwdmltepdgqekkqe 

sfksweasgkhqevskpavsleqrkqdtsbclrs 

tlpeeqkkqeiskskpspsqwkqdtpkskagyvq 

eehkkqetpklwpvqlqkeqdpkkqtpkswtps 

mqseqnttkswttpmceeqdskqpetpkswenn 

vesqkhsltsqsqispkswgvataslipndqllpr 

klntepkdvp/iacasa*gflplqppfrri/hvlrk 

eklqdlmtqiqgtcnfmqesvldfdkpssaipts 

qppsatpg*prrhlkeqnls\vkviffqgav'nvf 

nvnaplpprkeqekespyspgynqsfttastqtp 

pqcqlpsihveqtvhsqetanyhpdgtiqvsngs 

lafypaqtnvfprptqpfvnsrgsvrgctrggrl 

itnsyrspggykgfdtyrglpsisnghp^^sqlqfq 

aheysgapysqrdnfqqcykrggtsggpransr: 

agwsdssqvssperdnetfnsgdsgqgdsrsmt 

D\ TTAA 7T>1 T'TnKlTl A A 'I'll OA IT V% r\./~m T\^/^%. jm \ T A T^n i A r\ 

PVUVPVTwPAATILPVHVYPLPQQmRVAFSAAR 

TSNLAPGTLDQPIVFDLLLhfNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HR:GArYGSSW 


2965 


A 


3 


2454


fdtyrglpsisngnysqlqfqareysgapysqris 

aittvsvawkvlsgkigegaegnckcvisegaw 

avcptqpcgkakpdkhlkdllskllnsgyfesip- 

vpknakekevpleeemliqsekktqlsktesvke 

seslmefaqpeiqpqeflnrrymtevdysnkqge 

eqpweadyarkpnlpkrwdmltepdgqekkqe 

sfksweasgkhqevskpavsleqrkqdtsklrs 

tlpeeqkkqeiskskpspsqwkqdtpicsicagyvq 

eehkkqetpklwpvqlqkeqdpkkqtpkswtps 

mqseqnttkswttpmceeqdskqpetpkswenn 

vesqkhsltsqsqispkswgvataslipndqllpr 

klntepkdvp/iacasa*gflplqppfrri/hvlrk 

eklqdlmtqiqgtcnfmqesvldfdkpssaipts 

qppsatpg*prrhlkeqnls\vkviffqgav'i\vf 

nvnaplpprkeqeikespyspgynqsfttastqtp 

pqcqlpsmveqtvhsqetanyhpdgtiqvsngs 

lafypaqtnvfprptqpfvnsrgsvrgctrggrl 

itnsyrspggykgfdtyrglpsisngnysqlqfq 

areysgapysqrdnfqqcykrggtsggpransr 

agwsdssqvssperdnetfnsgdsgqgdsrsmt 

PVDVPVTNPAATILPVHVYTLJ'QQMRVAFSAAR 
TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 
CPVNGTYVFIFHMLKLAVNVPLYVNLMKNEEVL 

HRGAIYGSSW 


2966 


A 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 

LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 

LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLIEFLQ










SL*NSL\*AVTSYEDLGLFAFGLPGKLWAGTrnQ 
NIGAMSSYLLIIKTELPAAIAEFLTGDYSRYWYLD 
GQTLLinCVGIVFPLALLPKlGFLGYTSSLSFFFM 
MFFALWnKKWSIPCPLTLNYVEKGFQISNVTDD 
CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 
LQSPSKKRMQNVTNTAIALSFLIYFISALFGYLTF 
YD/GTTKAQRGEVTCHRIKDKVESELLKG***IP* ' 
SHDVVVMT\VKLCILFAVLL\TVPLIHFPAR^ 
MMFFSNFPFSWIRHFLITLALNinVLLAIYVPDIRN 
. VFGWGASTSTCLIFIFPGLFYLKLSREDFLSWKK 
LGVGCFC/LLSFKTSILRNSLSVYIILPASRKSIYFK 
I ... 


2967 


A 


3 


3222


SGIVWALWREKKPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT . 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVEEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENAIEALKEYEPEMGKVYRADRKSVQRIKARD 

IVPGDIVEVAVGDKVPADmiLAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVTS!QDI<I<m4LFSGT1NI 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLIC VAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTREIMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTDVRSLSKVERANACNSVIRQLMKKEFT . 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCNYVRVGTTRVPLTGPVKEKIMAVIKE. 

WGTGRDTLRCLALATRDTPPKREEMVLDDSARF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKTVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR . 

AIYNNMKQFIRYLISSNVGEWCIFLTAALGLPEA 

LIPVQLLWVNLVIDGLPATALGFNPPDLDIMDRP 

PRSPKEPLI\SGWLFFRYMAIGGYVGAATVGAAA 

wWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLUyvdPPVmnrWLLGSICLSMSLHFLILYVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 


2968 


A 


3 


2414


GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLELQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKJFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKPYLIKITSDWCFSCIHIEPVWKEV 

LGIINGKISFFHNAVVRENLRQFVESLLPGNLVEK 
VTNKNYVRFLSGWQQENKPHVLLFDQTPIVPLL 
YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 
NIYAPTLLVFKEHINRPADVIQARGMKKQnDDFI










TEINKYLLAARLTSQKLFHELCPVKRSHRQRKYC 

WLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 

AGRWYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDYISD 

CWDSIFHNNWVREMMPLLSLIFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSNLVRLRPGHMNV . 

VLILSN Si Kl sLLQKFALEV YTFTGSSCLHFSFLSL 

DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPDCGKLSKLSL 

WMERLLEGSLQRFYIPSWPELD 


2969 


A 


48 


1117 


KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQnWLFERPHTMPKYLLGSVNKSWPD/YGl 

P/YTSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 

NGTLSASQKIQVTVDDPVTKPWQIHPPSGAVEY 

VGNMTLTCHVEGGTRLAYQWLKNGRPVHTSST 

YSFSPQNNTLHjAPVTKEDIGNySCLVRNPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSHPPNTYSWIRRTDNTTYnKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVnTSVGMCDIQGRDPNKT 


2970 


A 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 
QFQNSSEMEKEPEIGKFGEKAPPAPSHVWRPAAL 
FLTLLCLLLLIGLGVIASMFHVTLKIEMKKMNKL 
QNISEELQRNISLQLMSNMNISNKIRNLSTTLQTI 
A I Ki/CKbL Y iKJiQ tHKCKr CrRR W I w HKDSC i F 

lsddvqtwqeskmacaaqnasllkinnknale
fiksqsrsydywlglspeeds/yswyesg*ynq\p 
sawviknapdlnnmycgyinrlyvqyyhctyk 
qrmicekmanpvqlgstyfrea 


2971 


A 


912 


2287 


vpnylpsvssaiggevpqryvwrfciglhsaprf 

lvafa ywnhylsctspcscyrplcrlnfgln w 

enlallvltyvsssedf/twvpg*grsgevfpegt 

glplphsdlptswcghslqcgsqssfppaihenaf 

iwiasslghmlltcilwrltkkhtvsqendglsl 

agaprqprrksrtsvlrirvmvrwelssngnpg 

rgvlglglglgnklrwgqnlgl*hcvwvvwe 

tge*krwrlqmgie*gvasrrq*vrnsvrglvc 

hnssappmymgffsptvfgggvgg*lhvtfilhp 

rbVbAAGlPLLLOrSLPQRQGREHlVVlLAAPACA 

pfhdr*wepreirpsp*elglrgeptlsypascrvi 
rqpip*drksyswkqrlfimfisffsalavyfrhn 
myceagvyitfaileytvyltnmafhmtawwd 
fgnkellitsqpeekrf


2972 


A 


1734 


246 


ggilsgrdgrtalprprepaertaglrrdmrpqe 

lprlafplllllllllppppcpahsatrfdptwes 

ldarqlpawfdqakfgihhwgvfsvpsfgsewf 

w w I vv v^iSJiJSJj^iv I V iir jVirvX/iN i x^r^oris. i iiiJrvJr Jw 

ftakffnanqwadifqasgakyivltskhhegf 
tlwg\seyswnwnaidegpkrdivkelevairnr 
tdlrfglyyslfewfhplfledesssfhkrqfpvs 
ktlpelyelvnnyqpevlwsdgdggapdqywn 










STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDKLSWGY 

RREAGISDYLTEEELVKQLVETVSCGGNLLMNIG 

PTLDGTISWFEERLRQMGSWLKVNGEAIYETHT 

WRSQ>a)TVTPDVWYTSKPKEKLVYAIFLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTIHQMPCKWGWALALTNVI 


2973


A 


24 


1133 


SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPR\EPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRVPVPKHWKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 


2974 


A 


271 


1854 


MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKEIRLPKSFSNTQ 

"NSRKEAVLLAKMKHPNIVAFKESFEAEGHLYIV 

MEYGDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCLGVNHIHKRRVLHRDIKSKNIFLTQNGKGKL

GDFGSARLLSNPMAFACTY\'GTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATILLSRGIVARLVQKCLPPEIIMEYGEEVLE

EIKNSKHhnTRKKTNPSRDUALGNEASTVQEEEQ 

DRKGSHTDLESINENLVESALKRVNKEEKGNKSV

HLRKASSPNLHRRQWEKNVPNTALTALENASILT 

SSLTAEDDRGGSVIKYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRA.GWQGLCDR 


2975


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK

AKNVNTGELAAIKVDCLEPGEDFAWQQEIIMMK 

D\GKHP\DIVAYRGSYL\RRJDKLWI\CMERCGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKNIALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSKNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGAN1CSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSEFIPQEMHSTEDENQGTDaiCPMSGSP 

\AKPSQVPPRPPPPRLPPHKi*vALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGrYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 










RQMQKLPVAIPAHKLPDRILPRKFSVSAKIPETK 
WCQKCCWRNPYTGHKYLCGALQTSIVLLEWV 
EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

DTPQTNVTHVTQLEIUDmVCLDCCIKIVNLQGR 
LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 
MQGRSFRSNEVTQEISDSTRIFRLLGSDRWVLES 
RPTDNPTANSNLYILAGHENSY 


2976 


A 


32 


2833 


PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRKNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAyF\GSYL\RRDKLWI\CMERCGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKhlKWSNSFHHF\nCMALTKNPKKRPT

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAfflaPDRILPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQtSIVLLEW 

EPMQKFMLIKHIDFPIPCPLKMFEMLVVPEQEYP 

T VPVr;VQPriRT^PMrv\A7T?PPT\rKTPXTCTCCTl/T7'rnC 

JLf V ^ vo V oivvii\X/riNts< V vx\_rci viNriNo l oo Wr X Jc.o 

DTPQmVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESrVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVWLES 

RPTDNPTANSNLYILAGHENSY 


2977 


A 


174 


1543 


YSLRKGITFKLAGAMVHIKKGELTQEEKELLEVl 

GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 

AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA

LMFAALSGNKDITWVMLEAGAETDWNSVGRT 

AAQMAAFVGQHDCVTIINNFFPRERLDYYTKPQ 

GLDKEPKLPPKLAGPLHKnTTTNLHPVKIVMLV 

NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 

NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 

SLLKG\RASDGFPVYPEKILRESIRK\FPYCEATLL 

or\T \rpcT AP\/T7T/^c'r\'DTAi7C\rT Tr\ ATmt^\Tn.'n\rr\\7 
Vvij VKoJ^vr VilluoJL/r 1 Aro Vl/Hs^AlUjl^ Vur VJJ V 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 
FTHKKICKNLKDrYEKQQLEAAKEKRQEENHGK 
LDVNSNCVNEEQPEAEVGISQKDSNPEDSGEGK 
KESLESEAELEGLQDAPAGPQVSEE 


2978 


A 


3 


5177 


SDDLRTGLFODVODAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRITPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 










VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMTVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

WKPFSEFGQMAVSSDWEKLLDCTVIVDSVFVN 

LGQHWHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQinCGRQIICSYL 

SQSIELKVVQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSIVlQVPSSNSSnYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPniHLEKRSLGLSETQnP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKniPPNFQEAFQIGIYWANTNTVHKS VAK 

LVHNLTSPKWKDGGNGEWTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYXPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK

QMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVnHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECSIHHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDW 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLIS 

NKELEEYKEKCFIKLCITLNEGKSILCDINEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTSITNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

T /~'T5'DT7\7TJTv A A T r^■\n/^ \rc> /^c^^/^t^trr?/""/^! t t T^T;\rT 
LOKj^tVHMALUV VLVKuauVjbHbGCLLLTSBVL 

FWSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 


2979 


A 


255 


2673 


AWLFPASVLCPRCLTGSAVGSAEWKSLWLFPFS 
SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPHIG 
>nnU.LKTIGKGNFAKVKLARHILTGKEVAVKnD 
KTQLNSSSLQKLFREVRIMKVLNHPNIVRLFEVIE 

TtJI^TT VT \/^>rPV A Cr;nP\7T7T^VT VATJi^PAAT^TTVTT A 
1 CJS. 1 J_» I J-f V iVLD I A oOOH V rUi L>\ AnOxUVUsJliSJiA 

RAKFRQIVSAVQYCHQKFrVHRDLKAENLLLDA 
DMNIKIADFGFSNEFTFGNKLDTFCGSPPYAAPEL
FQGKKYDGPEVDVWSLGVELYTLVSGSLPFDGQ 
NLKELRERVLRGKYRIPFYMSTDCENLLKKFLE










NPSKRGTLEQIMKDRWMNVGHE\DDELKPYGEP 

LPVDYKDPRRTELMVSMGYTREEIQDSLVGQRYN 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRJKASSTAKVPA 

SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLL\E 

RASUGQGFHPEWAKTALTTVDPGSRASTASAS AA 

VSAARPRQHQICSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 

V IrAorbunov^vjKKOAoOblroKj' lolvr VKKNLNt 

PESKDRWETLRPHVWNSGGNDKEKEEFREAKPR 

SLRFIWSMKTTSSMEPNEMMREIRKVLDANSCQ

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRISGTSMAFKNIASKIANELKL 


2980 


A 


120 


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 

TKLNERVKTMCLEEALNLAVMEFHNSLXQDFLNWLT 

QAEQTLNVASRPSLILDTVLFQEDEHKVFANEVN 

SHREQUELDKTGTHLKYFSQKQDWLDCNLLISV

QSRWEKWQRLVERGRSLDDARKRAKQFHEAW

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEA\

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

\ro A'l"l"ri>'I/'TT ■LTDT 'TT?'K.TV*^li^r>\17T TT^TOIj^lifCTTi/'^V A A 

vrAi 1 IrJSJLJorJLlKJN luKJ^WLlNoKMalrCJi^ 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKXTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTFRAGSRPSTAKPSBOPTPQRKSPASKLDKSSKR 


2981


A 


120 


3433 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 
LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 
OKGOOMLARCPK<?AFTMTnnr)TN>JT KPKWF^VP 

TKLNER\KTaaEEAL>}LA\MEFHNSL\QDFINWLT
QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 
SHREQnELDKTGTHLKYFSQKQDWLDCNLLISV 
QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 










SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNiaLEEA\ 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLroQHKEFMKKLEEKRAE 

LNKATTMGDWlAICHPDSITTIKHWrniRARFEE 

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSfflPV 

LDKGRAGRJKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFBLADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VrAl 1 IrJsJLHrLlKJNYUisJ'WLlNoKMalrCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA 

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 


2982 


A 


1 


2065 


MAAGGAEGGSGPGAAMGDCAEKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSIDLNKPI 

DKRIYKGTQPTCHDFNQFTAATEnSLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 

KQ\AWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYWTGGEDDLVTVWS 

FTEGRWARGHGHKSWVNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PLARTRTLPGTPGTTPPAASSSRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

A 1 L J LQbKi<iJKu AbK£HKK.YH.SLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PLLEPLVCKKIAQERLTVLLFLEDCnTACQEGLIC

TWARPGKAFTDEETEAQTGEGSWPRSPSKSVVE 

GISSQPGNSPSGTVV 


2983 


A 


3855 


220 


RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 
LQELNANLSNLTSAFBKATAEKIKCQQEADATN 
RVILLAMRLVGGLASENIRWAESVENFRSQGVTL 

V^VJJ-/ V JL/JL*10/\r Vox voir l JSJV I J\JN JGi^lVJLCISJ Wlx I i 

HNLKVPIPITNGLDPLSLLTDDADVArWNNQGLP 
SDRMSTENATBLGNTERWPLIVDAQLQC5IKWIKN 
KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 
GETVDPALDPLLGRNTIKKGKYIKIGDKEVGVPP 






QVPPDPTHQVLQPTLQARDAGSVH\LINFLVTRD 

GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 

IVLKELEDSLLARLSAASGNFLGDTALVENLETT 

KHTASEIEEKWEAKJtEVKINEARENYRPAAER 

ASLLYFILNDLNKINPVYQFSLKAFNWFEKAIQR 

TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 

KLIFLAQVTFQVLSMKKELNPVELDFLLRJFPFKA 

GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDl 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LCMVRCLRPDRMTYAIKNFVEEKMGSKFVEGRS 

VEFSKS YEESSPSTSIFFiLSPG VDPLKDVEALGKK 

LGFTIDNGKLHNVSLGQGQEWAENALDVAAEK 

GHWVILQNIHLVARWLGTLDKKLERySTGRHED 

YRVFIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAWAERRKFGAQGWNRSYPFNNGDLTISI 

^A^YNYLEANPKVPWDDLRYLFGEIMYGGHITD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 

NLDYKGYHEYIDENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETFNMAEIMAKAAEKTPYVVV 

AFQECERMNBLTNEMRRSLKELNLGLKGELTITT 

DVEDLSTALFYDTVPDTWVARAYPSMMGLAAW 

YANLLLRIRELEAWTTDFALPTTVWLAGFFNPQS 

FLTAMQSMARKNEWPLDKMCLSVEVTKKNRE 

DMTAPPREGSYVYGLFMEGARWDTQTGVIAEA 

RLKELTPAMPVIFIKAIPVARMBTKNIYECPVYKT 

RERGPTYVWTFNLKTKEKAAKWILAAVALLLQV 


2984 


A


2 


1464 


FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERN-IRQIAIKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGWGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVEtWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQWHKNTRFLRDPFSQALSR

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATPSPAPGVNESPMMTWGEVENTPLRV 

EGSETPYVDRTPGPAFKILEPGRRERLGLKMANE 

AAAKNRAKKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 


2985 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

LnjouUl ONIJDQ 1 GAACr&Klj YKVJNVjJJKCiGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 










WFDGKEFSGNPnCVSFATRRADFNRGGGNGRGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 
APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 
RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 
DSRGEHRQDRRERPY 


2986 


A 


1890 


178 


ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

M4TIFVQGLGENVT1ESVADYFKQIGIIKTNKKTG 

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 
APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 
RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM
DSRGEHRQDRRERPY 


2987 




1376 




WAGARQHGRNWRKRETSPGTQGPLPPVPR/VPP 
GPDG\PHAIAPTLSWAIPRQQCSPQPGRLNALPPD 
RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 
CPPQEPSLRSSRNRLREGQTFGRMEl 


2988 


A 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSHLLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCWDYGGSSSTEN 

A VTA TP PT PnPT rJPT \7 A \/ A Cr*UC ATT CWI A APP/^ 

RPLGTAIWGFFVCWAPYHLLGLVLTVAAPNSA 
LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 
LRRSLPAACHWALRESQGQDESVDSKKStSHDL 
VSEMEV


2989 


A 


27


4074


KSQLFCFWVGKAGDDLSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDDIQKGNTKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPLSSGISTPVTNVSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQMVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYID YEEEEMETVEOSTORIKEFROLVT A DMO A 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEIELQQQTIESLKEKIYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 






LVFSKWEAWQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKWGPELPKINWWIVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQVHQFTNTETATLIESCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGYGTLL 

SGHSGFDRPSAVKTKESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYIERIQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELKNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

DVLRYVINLADGNGNTALHYSVSHSNFEIVKLLL 

DADVCNVDHQNKAGYTPIMLAALAAVEAEKDM 

P r\rCPT 'ViCXC'C^TWT^S a V a Cr\ a CXC^T a T IVyn A A/OU/^D T 
xvj V UllLr OL/VJJJ V IN AJVAbv^Aul^ i AJLIVUL/A V oriOKJ 

DMVKGLLACGADVMQDDEGSTALMCASEHGH 
VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 
KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 
RGSFD 


2990 


A 


6? 


1687 


ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 

AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 

RGGSAQTGHQHPGPYECQCPGPQPGGTTP ALLSL 

ELEETRGPPASANPDKDHSTQPGTMGRKKIQISRI 

LDQRNRQVTFTKRKFGLMKKAYELSVLCDCEIA 

LUFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 

TNTDILETLKRRGIGLDGPELEPDEGPEEPGEKFR 

RLAGEGGDPALPRPRLYPAAPAMPSPDVVYGAL 

PPPG\CDPSGLGEALPAQSRPSPFRPAAPKAGPPG 

LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 

GPRGGLNTSRSLYSGLQNPCSTATPGPPLGSFPFL 

j^OOr Jr V O AIIA W Ajnjs. V r v^Jr AA 

LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPP\ 
CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 
\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 
GLLSFFLFVCISTNKNARGVRGPEKK 


2991 


A


3


1159 


IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPLSSLPDBCKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

>ryNKLKNTLRNLNLHTVCEEARCPNIGECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP 

DGGAEHIAKTVS YLKERNPKILVECLTPDFRGDL 

JsAUiJ^. VAIjOvjJLIJ V I ArliN Vtl VrtLl^oK.VKL'rKA 

NFDOSLRVLKHAKKVOPDVISKTSTMLGLGENDE 

QWATMKALREADVDCLTLGQYMQPTRRHLKV 

EEYITPEKFKYWEKVGNELGFHYTASGPVLVRSS 

YKAGEFFLiCNLVAKRKTKDL 


2992 


A


3 


1636 


PVPGVPTSPPSCCPQDMQGPWVLLLLGLRLQLSL 










GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNLILFLGDGLGVPTVTATRILKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVTTTRVQHASPAGTYAHTV 

NKNWYSDADMPASARQEGCQDIATQLISNMDID 

VILGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKHQGA WYVWNRTELMQASLDQS 

VTOLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSKNPRGFYLFVEGGRIDHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

GPG YVFNSGVRPDVNESESGSPDYHQQAGWPLS 
SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 
VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 
PLLAGTLLLLGASAAP 


2993 


A 


3


685 


DAWARLLBa^WRLFGBCAKPKAPPPSLTDCIGTVD 
SRAESIDKKISRLDAELVKYKDQIKKMREGPAKN 

TS\HYTIQSLKDTKTTVDAMKLGVKEMKKAYKQ 
VKIDQiEDLQDQLEDMMEDANEIQEALSRSYGTP 
ELDEDDLEAELDALGDELLADEDSSYLDEAASA
PAIPEGVPTDTKNKDGVLVDEFGLPQIPAS 


2994 


A 


1710 


161 


RRCELTPFUKTLILPKSWGAFPEDWMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPREIFEHGSPS 

YIQVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSnQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 
EEEEEE\HFEVINDEVKWARKHGQPGTPVAIA'n 
QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 
GFEDSMC 


2995 


A 


3


924 


SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 
ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 
APATTSSWEWRNPLIASSFSLVKLVLRRQLICNK 
CCPPPCKFGEGKLSKRLKHKDDSVMKATQQARK 
RNFISSKSKQPAGHRRPAGGIRESBCESSKEKKLTV 

Jvv^l^i^DJL'lv I Aiirl V AA 1 XV^ALrls^l-'^u 1 AA WKO\KV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAVV 
EPMLWNPSGTPKRYSLELGKAIKQKLWEALCSQ 
GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 
SKK 


2996 


A


3 


1713 


GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEEL WOD AEOIKRCOEKTTMKLLSR TTFLNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKHNLDUBHNKSNAAKNLDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKIFTQRSHFFAPQKIHT 










VEKPHELSKCVNVFTQKPLLSIYLRVHRDEBCLYI\ 

CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 

\CGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQRIHTGERSYICTQCGQAFIQKAHLIAHQRIH 

TGEKPYECSDCGKSFPSKSQLQMHKRIHTGEKPY 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 
SELITHQRIHTTEKPYKCPDCEKSFSKKPHLKVHQ 
RIHTGEKPYICAECGKAFTDRSNFNKHQTIHTGD 
KPYKCSDCGKGFTQKSVLSMHRNIHT 


2997 


A 


3 


1763 


AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 
FQNISCGIHYLASVFMGVTPHHVCRPPGNYSQVV 
FHNHSNWSLEDTGALLSSGQKDYVTVQLQNGEI 
WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 
YIYDQNTWKSTAVtQWNLVCDRKWLAMLIQPL 
FMFGGPTGIGArrFGYFVSDRLGRRWLWATSSS 
MFLFGIAAAFAVDYYTTMAARFFLAMVASGYLV 
VGFVYVMEFIGMKSRTWASVHLHSFFAVGTLLV 
ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE
TPFWLLSEGRYEEAQKMVDIMAKWNRASSCKLS 
ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 
RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 
FLLGVVEIPAYTFVCIAIVEDKVGRRTVLAYSLFCNS 

/\i_*A^VJ V V IVl V Irv^iVii 1 iJLw V V 1 /\iVl\ V UlVlJjr iOAA 

FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLASIL 
APFSVDLS.SIWIFIPQLFVGTMALLSGVLTLKLPE 
TLGKRLATTWEEAAKLESENESKSSKLLLTTNNS 
GLEKTEAITPRDSGLGE 


2998 


A 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KLANNGTVLRASHGTKMMTPEVLAEAYGBCKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKLNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

LKNAGKILLLITSSHSDYCRLLCA\YILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

SLDKPGWYSQGNAVHLYELLKKMTGKPEPKVV 

VT7rir^^;\yrHQi^n7P A puvQTvni/t''i'\n rr T3t?T ■pi^Ti'cnT 
I r oi-'oivjiioJ-'irr AJvJi i oiN wn i v iju^cs^Lit\\jiJc\j l 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 

FFMDSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLDYICFTRFSSSNSKTAGYYPNPPLVLSS 

DETLISK 


2999 


A 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQKNQTHRSSLHYKPTTDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

TGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

P^AAPA<snnT O"?! F'sK'T T<sVRFMr;nMr;<5FFFr)RT 

0/Vrt_r/\iJV^\^l_»\^01-«l-iOJVl_» 1 O V I\_nvl\JJLyiVX\JOX^I-<I-fi-'I\JL 

NATVWKLQPTAGLQDLHIHSRQEEEQSEIMEYS 
VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 
DKNSSQVLGEKVLGrVVQNTKVANLTEPWLTF 
QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 










ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 
BCHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 
CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 
LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 
RLWEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

VSYITNLGLFSLVFLFNMAMLATMVVQILRLRPH 
TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 
LWLYLFSnrSFQGFLIFIWYWSMRLQARGGPSP 
LKSNSDSARLPISSGSTSSSRI 


3000 


A 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDBCEPLIELFVK 

AGSDGESIGNCPFSQRLFMILWLKGVVFSVTTVD 

LKRKPADLQNljy»GTHPPnTFT<JSEVKTDVNKffiE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

VTTfXTCP'DTJ AMITAT ■pT?r^T T VTl rWI r\T7VT "KTCT>T T>r^ 

I JJvlN oxvrilAr^EAljliJ\,OJLJL«A. i. JLv^lsXrL'C' i L»1n or^J_.r JD 

EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 
HTVKWAKKYRNFDffKEMTGIWRYLTNAYSRD 
EFTNTCPSDKEVEIUYSDVAKRLHQVKSRLLKE 
VSFMSSP 


3001 


A 


779 


2006 


LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

rlr r o V W or AUr oi^r AOr LLrl-b JJ 1 y u W WOrN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 
*SAVAV*PCPRGAHSLERAARQYT1SGSSTSQSGK 
CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 
AFQEGSHLGEGHL 


3002 


A 


909 


2799 


YEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQKIMKRGKKSYEGKNFENl 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLILHQRTHTGEKPYRCN 

ECGBa»FTDISHLTVHLRIHTGEKPYEeSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

rVHQKIHSGEKPYECKECGKTFffiSAYLIRHQRIH 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 
KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 
SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 
VFRRRVLDLTPLWSVEKNPLSYPVN 


3003 


A 


2 


1489


SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDKLASYLDKVRALEEANADL 

EVKIRDWYORORPSEIKDYSPYFKTIEDLRNKnA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LAYLRKNH'EEMLALRGQTGGEVNVETDAAPG 

VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 










tELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 
QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRPILKEQSSSSFSQGQSS 


3004


A 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAWLYNEERYGNITLPMSHAG 

TGNIWIMISYPKGREILELVQKGIPVTMTIGVGT 

RHVQEFISGQSWFVAIAFITMMIISLAWLIFYYIQ 

RFLYTGSQIGSQSHRKETKKVIGQLLLHTVKHGE 

KGIDVDAENCAVCIEOTKVKDnRILPCKHIFHRIC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GPIS 


3005 


A 


184 


2552 


TMTIHQFLLLFLFWVCLPHFCSPEIMFRRTPVPQQ 

RILSSRVPRSDGKILHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

LFIIDEKTGDIHATRRJDREEKAFYTLRAQAINRR 

TLRPVEPESEFVKIHDINDNEPTFPEEIYTASVPE 

MSWGTSWQVTATDADDPSYGNSARVIYSILQ 

GQPYFSVEPETGIIRTALPNMNRENREQYQWIQ 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPQ

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YRIIDGDGTDMFDIVTEKDTQEGUTVKKFLDYES 

rrlytlkveaenthvdprfyylgpfkdtnvklsi

edvdeppvfsrssylfevhedievgtiigtvmard 

pdsisspirfsldrhtdldrifnihsgngslytskp 

ldrelsqwhnltviaaeinnpkettrvavfvril 

dandnapqfavfydtfvcenarpgqliqtisavd 

kddplggqkfffslaavnpnftvqdnedntardl 

trkntgfnrheistyllpwisdndypiqsstgtltl 

rVcacdsqgnmqscsaealllpaglstgaliail 

LCnn^LVIVVLFAALKRQRKKEPLILSKEDIRDNrV 

syndegggeedtqafdigtlrnpaaieekklrrd 
iipetlfiprrtp.tapdntdvrdfinerlkehdldp 
tappydslatyayegndsiaeslsslesgttegd 
qnydylrewgprfnklpqkygggesdkds 


3006 


A 


2 


541 


GRVDKTWWGKSVGIMLTELEKALNSriDVYHKY 

SLIKGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDESfTDGAVNFQEFLILVIKMGVAALNSn 

DVYHKYSLKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLDLVIKMG 

VGSPQKKVASYF 


3007


A 


1 


1253 


MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAINVPEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 

CPMKEGNPFGPFWDQFHVSFNKSELFTGISFSAS 

I isJbKl W o^Kr or J\Jtirlr V LALrvjr Ar A " V LiCnrlKr 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 
RIGSDWKNACAMLKDGTAGSHFMASPQCVGYS 
RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAQ 
SVYVATDSESYVPELQQLFKGKVKWSLKPEVA 
QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 


3008


A 


3136 


1898 


TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 

QPGFLERLSETSGGMFVGLMAFLLSFYLIFTOEG 

RALKTATSLAEGLSLVVSPDSmSVAPENEGRLV 

HnGALRTSKIXSDPNYGVHLPAVKLKRHVEMY 

QWVETEESREYTEDGQVKKETRYSYNTEWRSEn 

NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 

FLSSGLIDKVDNFKSLSLSKLEDPHVDIIRRGDFF 

YHSENPKYPEVGDLRVSFSYAGLSGDDPDLGPA 

HVV 1 VlAKC^KolJC^LVrrblKbODlLLLJUHLHODFS 

AEEVraRELRSNSMKTWGLRAAGWMAMFMGL 

NLMTRILYTXVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPILVAR 

TRVPAKKLE 


3009 


A 


93 


659 


D/1lAVAMTAQGGLVANRGRRFKWAIELSGPGGG 

bKCjKbDRubOQCjDSLyPVGYLDKQVPDTSVQET 

DRILVEKRCWDIALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALM/USATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 


3010


A


2


1041


LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 
VVPECTMASSNTVLMRLVASAYSIAQKAGMIVR 
RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 
LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

qpcpsqysaikeedlwwvdpldgtkeYtegll 
dnvtvligiayegkaiagvinqpyynyeagpdav 

LuK 1 1 Wu VLuLCj At Gr QLlCb VPAGKHIITTTRSH 

snklvtdcvaamnpdavlrvggagnkiiqlieg 
kasayvfaspgckkwdtcapevdlhavggkltd 

IHGNVLQYHKDVKHMNSAGVLATLRNYDYYAS 

rvpesiknalvp 


3011 


A 


291 


1452 


spqktmrshtitmtttsvsswpysshrmrfitnh

sdqppqnfsatpnvttcpmdekllstvlttsysvi 

fivglvgniialyvflgihrkrnsiqiyllnvaiad 

lllifclpfrimyhinqnkwtlgvilckwgtlfy 

mnmyisnllgfisldryikinrsiqqrkalttkqsi 

yvccivwnilalggfltmnltlkkgghnstmcf

ri y KJJi<U^N Ais.oiiAlr Nr IL V VMr WLlFLLIILS YIKI 

GKNLLRISKRRSKFPNSGKYATT/^SFIVLIIFTI 

CFVPYHAFRFIYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSCLDPVMYFLMSSNIRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 


3012 


A 


246 


1346 


TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQENFNISRIYGKWYNLAIGSTCPW 

LKKJMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

WHTmT)EYAIFLTKKFSRHHGP'nTAKLYGRAP 

QLRETLLQDFRWAQGVGIPEDSIFTMADRGECV 

rObl^lirbPlJjlPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCOLGYS A GPrMOMT*?!? YFYNGT9M A r 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAnQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 


3013 


A 


67 


379 


RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL 






WO 01/57190 PCT/US01/04098



VSEEDMVTWEDWMNFYINYYRQQVTGEPQER 

DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 
pppcc 


3014 


A 


1 


373 


GTSWSTLRAVMSASVVSVVSRVLEEYLSSTPQRL 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 


3015 


A 


2


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRiaDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

>JLCLLAiaFLDHkTLYFDVEPFVFYILTEVDRQG 

AHTVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

iVrl^JArS) Ytbol<>_Lt.b J VOorbKrLIsULCjKLbYRbY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 

LQSLNMVKYWKGQHYICVTPJOLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT


3016 


A 


2 


1321 


AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKryCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

isJrJUJAro I I1L.01SJL1I0 J VOorJllsJi^ljoL'JLvjlsXfO YKoY 

WSWVLLElLRDFRGTLSKDLSQMTSITQNDnST 
LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 
KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 
AVT 


3017 


A 


38 


704 


EAHPGGQLGSERNGVRMDEDVLTTLKILnGESG 
VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD 
yjN I<lA1<LLAi WU 1 AOv^bKr RTL 1 r b Y YRG AQG VIL 
VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 
LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

Arv 1 l_ULj V y l^Ar tcL V JiJvtH^ 1 rvjL W JBoliiNt^W Jvlj 

VKLSHREEGQGGGACGGYCSVL 


3018 


A 


2640 


2861 


APVLILQMVKLSIVLTPQFLSHDQGQLTKELQQH 

VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 

HTSHSG 


3019 


A 


1307 


711 


PGITMAASLVGKKIVFVTGNAKKLEEWQELGDK 

QGPVLVEDTCLCFN/yLGGLPGPYIKWFLEKLKPE 
GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 
RGRTSGRTVAPRGCQDFGWDPCFQPDGYEQTYA 
EMPKAEKNAVSHRFRALLFLOFYFGSLAA 


3020 


A 


1202 


180 


VSCLPTSCKMTTLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCIFnGLFVNITALWWSCTTKKiR.TTVTIYM 
MNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLLAnSADRYMATVQPKY 
AKELKNTCKAVLACVGVWIMTLrnTPLLLLYK 

DFDKDb rPATCLltlbDUYLKA VlNl VLNLTRLTFFF 

LIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRI 

nTLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNLSTCLDVILYYIVSKQFQARVISVM 

LYRNYLRSMRRKSFRSGSLRSLSNTNSEML 


3021 


A 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 

KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 

RPQRPKNAYDLKKSRISKKPQVPKKPREWKNPES 

QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 

PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 

PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 

LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 

WVIKKLMCEINVMEAVRDIRFLHSEALLAVAQN 

RWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLA

TASETGFLTYLDVSVGiOVAALNARAGRLDVMS 

QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 

HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 

TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 

NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 

FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 

RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 

DVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKG 

RSSTASLVKRKRKVMDEEHRDKVRQSLQQQHH 

KEAKAKPTGARPSALDRFVR 


3022 


A 


1


2249 


MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 
LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 
LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 
LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC
DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 
VRTSKGNTPTQKTHLSEIKMCVPVLKDELPAAEH 
QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 
EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 
APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 
KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK
CEKAFTCKNTLVQHQQIHTGQKMFECSECEESFS 
KKCHLILHKIIHTGERPYECSDREKAFIHKSEFIHH 
QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 
GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 
CCECGKSFRQIFNLIRHRRVHTGEMPYQCSDCGK 
SFSCKSELIQHQRIHSGERPYECRECGKSFRQFSN
LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 
luiiKi'itCbJbCtjJs.br lKJ\.bDLIQHKRIHrGTRPyc 
GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 
SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 
LIRHRRVHTGEMPYQCSDeGKSFSCKSELIQHRRI 
HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 


3023 


A 


3148 


634 


AAGALRCLAAFPRAEPASRGRQSSPARACAASR 
AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 
VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 
DSHWWSRFQKGDEPWDDKDFRMFFLWTALFWG 
GVMFiiTLLLKRSGRElTWKDF\T<rNYLSKGVVDRL 
EWNKRFVRV rKiVGKTPVDGQYVWFNIGSVDT 
FERNLETLQQELGIEGENRVPVVYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEBDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QE>ra.NQLLVEMDGFNTTrNVVILAGTNRPDILD 

PALLRPGRFDRQIHGPPDIKGRASFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKXTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRLTTGAQDDLRKVTQSAYAQIVQ 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 


3024 


A 


274 


1455 


LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAnSRVQCRTVALDLRSHGETKVKNPED 

LSAETMAKDVGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHfASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMQNFLRGRPKTFKSLENAIEWSVKSGQIRNLE 

SARVSMVGQVKQCEGITSPEGSKSrVEGIIEEEEE 

DEEGSESISKRKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPIOLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHED APDKV 

ABAVAlrLlRHKrAbPlGOr'QCVFPOC 


3025 


A 


621 


306 


YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL 
SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 
HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 


3026 


A 


1533 


454 


AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 

YEAHAWIQRILSLQNHEniENNHILYLGRKEHDIL 

SQLQKTSSVSITEHSPGRTELEIEGARADLIEWM 

lOTEDMLCKVQEEMARKKERGLWRSLGQWTIQQ

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVLKVEKTONEXO-MAAFQRKKKMMEEKLHR 

QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD 

PKYGAGIYFTKNLKNLAEKAKKISAADmYVFE 

AEVLTGFFCQGHPLNTVPPPLSPGAIDGHDSWD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 


3027 


A


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFEFSK 
SMNESMKNQKEFMLMNARLQLERQLIMQSEMR 
ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGAI 
KKKKPAFLVPIVPLSFILTYQYDLGYGTLLERMK 
GEAEDILETEKSKLQLPRGMITFESIEKARKEQSR 

L' L' 1 1 W 


3028 


A 


876 


1226 


AVGKEPESSSTWVRDREGHIRSRRSMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 


3029 


A 


3 


1731


FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQiCLPELRGVGDPAMISSNTSYL 

SSRGRMKWFWDSAEECiTlTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NWVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

AITQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLE-ISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVrVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FGTLYTENQDVATAVASLCRYHGLSGLVQYQM 

At^LLKUKlJ.W JjN V Y LrcN HAKJLKAAHTY V SEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML

LWRRFLDNKVLLSFGBCAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 


3030


A 


1 


584


PWLPWSDGRAARSSRKCPRSRFPVQVGKMAVST 

Vrb J 5bJ_.ML.ALbKxlS>LJLbrLLi>V 1 brKRr YRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDEILI 

NEPSNDWDIYYWATEAKPAPEIFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 


3031 


A 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 
CWSSCGQHPVQATHRGAVSNSLMLCILKLASQM 
PLEmrVQQMVFMLLSNLALSHDCKGVIQKSNF 
LQNFLSLALPKGGNKHLSNLmWLKLLLNiSSGE 

FmWCFSPANKPKILANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS


3032 


A 


2 


1242 


GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 

LSRPKIOKKPRTKNTPASASLEGLAQTAGRRPSEG 

NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST 

QKKSSSSSLLKNENGIDAEPAEEAVIQKPRRKTK 

KTQPAELQYANELGVEDEDnTDEQTTVEQQSVF 

TAPTGISQPVGKVFYEKSRRFQAADRSELIKTTEN 

IDVSKffi\nKJ>SWTtRDVALTVHRAFRMIGLFSHG 

FLAGCAVWNIWIYVLAGDQLSNLSNLLQQYKT 

T A VPpr^CT T VT T T A T CTTC A 'Cr\'DTTM? A VTC\7 A roXTC 

LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 
SSVNGSLWEAGIEEQILQPWrVVNLWALLVGLS 
WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 

SS


3033 


A 


3 


1436 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SOSSSDGSCKTAGEMVFVYENAKFGARNTRT^FR 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGn 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMILPLMVASAQSGERE 
Ulll V VLl JJULJ V V JJ WJJtJl I rrt^Moilt I ol^Uy 

LYRFFKYIEKRDVAKSVLKERGLKKIRLGIEGYP 
TYKEKVKKIU>GGRPEVry>IYVQRPFIRMSWEKE 
EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 
VMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


3034 


A 


3


1972 


SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHNRAITHLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAILGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM 

KVLREVKVLAGLQHPNTVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSHFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

KNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY

VMANVATKIFQELVEGVFYIHNMGIVHRDLKPR 

NIFLHGPDQQVKIGDFGLACTDILQKNTDWTNR 

N OJvK If IHL iK V tj 1 i^L Y Aortt^JLbUot i UAKbU 

MYSLGWLLELFQPFGTEMERAEVLTGLRTGQL 
PESLRKRCPVQAKYIQHLTRKNSSQRPSAIQLLQS 
ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 
SQDKGVRDDGKDGGVG 


3035 


A 


110 


1172 


KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK

MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCDEKKTMEGRSSKVYFTINMNLDVSD 

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQQ 

TQLNTDLTAFLQKHCHGDVCILNATEWVREHAS 

O I V oKJJ 1 ooor 1 i vjo 1 V v^o V lJL,ir 1 KJ_. Wl i brlrli i 

NKCKRKNILEWAKELSLSGFSMPGKPGWCVEG 
PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 
DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 
QFLNTKGCGDVFQMFLWV 


3036 


A 


1 


2288 


FRFAERRAAAAESDVSApyvIAGRSMQAARCPTD 

ELSLTNCAWNEKDFQSGQHVIVRTSPNHRYTFT 

LKTHPSWPGSIAFSLPQRKWAGLSIGQEIEVSLY 

TFDKAKQCIGTMTIEIDFLQKKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

lEAMDPSILNGEPATGKRQOEVGLVVGNSQVAF 

EKAENSSLNLIGKAKTKENRQSnNPDWNFEKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GILLYGPPGCGKTLLARQIGKMLNAREPKWNG 

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG 

LHniFDEIDAICKQRGSMAGSTGVHDTWNQLLS 

KIDGVEQLNNBLVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

DIKELAVETKNFSGAELEGLVRAAQSTAMNRHI 

V A "sTK VFVDMPlir A OVTR (TDFI A <?T P>JT)TT<rP 

AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 
LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 
EESNFPFKICSPDKMIGFSETAKCQAMKKIFDDA 
YKSQLSCWVDDffiRLLDYVPIGPRFSNLVLQAL 



L\^LKKAPPQGRKLLnGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQLLEALELLGNFKDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD


3037 


A 


1 


1347 


MLDTGSEHLNRILKALPALQSAGSEGQNG SAESL 

GEGGTRDSDRARRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTTCQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 


3038 


A 


924 


501 


TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

KLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPLISHSVNILA 

LNWR 


3039 


A 


1263 


111 


ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLLIL 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYFNLSAIKE

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFQ 

PEDTCAKLRCSLHASLLWTLNPDQCHPSRKRRA 

AffVPKLSCKNfLCHRHQLFINFRDLGWHKWIIAP 

KGFMAOTCHGECPFSLTISLNSSNyAFMQALMH 

AVDPEIPQAVCIPTXLSPISMLYQDNNDNVILRHY 

EDMVVDECGCG 


3040 


A 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLWSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHfflSR 

KDT 


3041 


A


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRTVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNfYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3042 


A 


1015 


175 


GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 
KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

GLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFffLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 


3043 


A 


153 


1133 


VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 
ASVIVGHPLDTVKTRLQAGVGYGNTLSCIRWY 
RRESMFGFFKGMSFPLASIAVYNSWFGVFSNTQ 
RFLSQHRCGEPEASPPRTLSDLLLASMVAG WS V 
GLGGPVDLIKIRLQMQTQPFRDANLGLKSRAVAP 
AEQPAYQGPVHCITnVRNEGLAGLYRGASAML
LRDVPGYCLYFEPYVFLSEWITPEACTGPSPCAV 
WLAGGA4AGAISWGTATPMDWKSRLQADGVY 
LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 
GFPMSAAMFLGYELSLQAIRGDHAVTSP 


3044 


A 


41 


1316 


PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRRNKKRGIFPKVATNIMRAWLFQHL 

SHPYPSEEQKKQLAQDTGLTBLQVNNWFINARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAlPASVGMSLNSEGEWHYL 


3045 


A 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLNYLGPPFNEPDFN 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SDFLQKMDDFWLLKELQTWLWRSAKDFNRLKK 

KMQPPAAAVTLHLGAHGF 


3046 


A 


1185 


1584


MYAYMYIGTfflCICAYRGIHIDVYLYMCIYIfflWI 
HTYLCVHIYVYVYlCTfflCMCIHTYVYVYTYMY 
VYTYICLCVYICLCVHIYLCVYIHMYMCTHICMC 
IHTYVHMCICVYIHMYTCVYVYTYTCVYMY 


3047 


A 


811 


132 


SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

NTDAHLDINFKEGLKKERSYTGQFEANVRDEER 

QCGCGVVPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAHLLNRLGYNPNREFFLNVPFDSI 

MERLTLREODPVTGERYHLMYKPPPTMEIQARLL 

QNPKDAEEQVKLKMDLFYRNSADLEQLYGSAIT 


3048


A 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAEITGRLHQ 
YDGSIWIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPWEVREQAVEGGEVELSCLVPRSR 



PAATLRWYRDRKELKGVSSSQENGKVWSVAST

VRFRVDRKDDGGmCEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN 

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

nCVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 


3049 


A 


3159 


882 


VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEWQAFRAAVATTRGDQESAE 

ANKFQVTDSAAFNALVTPCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDKAYLGSAIQL 

VSCLSETTVLAAVLRHISVLVPCFLTFPKQCRML 

LKRMWVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVnGCKLIPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFELEMFQQVDFNRKPGRMSSK 

PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPWLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

EISGKERLEDLNFPEDCRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 


3050 


A 


870 


182


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCS S SCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 


3051 


A 


175 


4330 


NIPRWNFQGKSFGWLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAGIFELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRNIATYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKYIHRDrKGQNVLLTENAEVKLWFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEVIACDENPDATY 

DFKSDLWSLGITAIEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRIQLKDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

STLRRDFLRLQLANKERSEALRRQQLEQQQREN 

libHJCK.QLlJUlRQK.RIEJiQK£QRKJKLEljQQ 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKRKQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 
SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP 

PMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TOIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKKAIDEDLTALAKELRELRIEETNRPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTV AVSDI 

PRLIPTGAPGSNEQY>A'GMVGTHGLETSHADSFS 

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEAKKJSVVNVNPTNI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKKNKLRVYYLSWLRNRILHNDPEV 

EKKQGWITVGDLEGCIHYKVVKYERIKFLVIALK 

NAVEIYAWAPKPYHKFMAFKSFADLQHKPLLVD 

L.L\ iitujy KL.1S. virUbrilOi*HVIJJVJJb(jjNaYL>iyir 

SfflQGMTPHAIVILPKTDGMEMLVCYEDEGVYV 

NTYGRITKDWLQWGEMPTSVAYIHSNQIMGW 

GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 

DKVFFASVRSGGSSQVFFMTLNRNSMMNW 


3052 


A 


1 


615 


MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 
BCLEASAGDPAGSEQEEEGSGGDSEDDGFLDSSA 

nr^^t^ ATT i~ix>vx>VJ i/"^ cf ot/^ a ■dc/^ a Ti\rr A r^xT'r a 
vjuruALLUrKrJs.l^JvljbL010ADiiuArV lAOV 1 A 

PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 

RSQIAHALKLSEVQVKIWFQNRRAKWKRIKAGN 

VSSRSGEPVRNPKIWPIPVHVNRFAVRSQHQQM 

EQGARP 


3053 


A 


203 


2167


FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTVVAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

OTWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEIPTDPSEEPGISTS 

DILSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KISLLLHQRGHAQERPFSCPQCGIDFNGHSALIRH 

GERPFSCPHCGKSFIRKHHLMKHQRIHTGERPYP 
CSYCGRSFRYKQTLKDHLRSGHNGGCGGDSDPS 
GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 
GSGGGVL 


3054 


A 


3 


2212 


SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF 
APWARASFLCHAFQRPLTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDELHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGWDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSkCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGKFFSQRSNLIHHKRVHTGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGKFFS 

HIASLIQHQIVHTGERPHGGGECGKAFIRSSDLMK 

JnyKVH 1 OiiKPYbCN bCOKLr bCJboSLNSHRRLHT 

GERPYQCSECGKFFNQSSSLNNHRRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHkPDRPYECSECG 

KAFNQRPTLmHQKIHERERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 


3055 


A 


268 


2954 


arrssssqgsaaptpcqvveasrdqlvagpsgk 

mgnremeeliplvnrlqdafsalgqscllelpqi 

avvggqsagkssvlenfvgrdflprgsgivtrrp 

lvlqlvtskaeyaeflhckgkkftdfdevrleie

aetdrvtgmnkgissipinlrvysphvlnltlidl 

pgitkvpvgdOppdieyqirmimqfitrenclila 

vtpantdlansdalklakevdpqglrtigvitkl 

dlmdegtdardvlenkllplrrgyvgvvnrsq 

KDroGKKDIKAAMLAERKFFLSHPAYRHlADRM 

GTPHLQKVLNQQLTNHIRDTLPNFRNKLQGQLLS 

IEHEVEAYKNFKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINRIFHERFPFEIV 

KMEFNEKELRREISYAIKOTHGIRTGLFrPDMAFE 

AIVKKQIVKLKGPSLKSVDLVIQELINTVKKCTK 

KLANFPRLCEETERTVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HIFALFNTEQRhfVYICDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

MDPQLERQVETERNLVDSYMSIINKCIRDLIPKTI 

MHLMINNVKDFINSELLAQLYSSEDQNTLMEES 

AbyAyKKJLJbMLKMYQALKiALuIIQDIGTATVS 

TPAPPPVDDSWIQHSRRSPPPSPtTQRRPTLSAPL 

AlsJr I ouK^jrr ArAlrorornouArr VrrKrOPLrrrr 

SSSDSFGAPPQVPSRPTRAPPSVPSRRPPPSPTRPTI 

TRPT P5;^T T r> 


3056 


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 


3057


A 


1674 


1839 


WRVTCCPPARSTTERTNAYDEEDCVEMVASGG

WNDVACHTTMYFMCEFDKKNM


3058 


A 


3363 


2525 


FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVAVK

PT Ml<rVT ONAPnPTT WA<3^Tv/n PMl 1 T PPQPQTfPPT 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF
QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT
LGLLRNLLSTRPHIDKIMSTHGKQIMQAVTLILEG
EHNIEVKEQTLCILANIADGTTAKDLIMTNDDILQ
KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKLRDMGIVDILHKLSQSPDSNLCDKAKMA 
LQQYLA 




A 
A 




10/ 


SS WPSLSSQMHFPSFHLHV AAHY GRDSFVRLLLE 
FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN 
ANIDIQNGFLLRYAVIK5NHSYCRMFLQRGADTN 
LUKLtUtiV^ 1 PLHJ^S ALRDDVLCARMLYNYGAD 
TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 


3060 


A 


30 


234 


PPLQLDMDPNCYCADGDSCTCAGSCKCKECKCT 

sLlUCSCCSCCPAGCAKCAQGCICKGATDKCSCC 

A 


3061 


A 


428 


720 


VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAlFQLICVLAnVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKRPKKETKKKR 


3062 


A 


1589 


276


WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

\TOSSMK1<!FKAFFRWLYVAMLRMTEDHVLPELN 

PCMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 

FCPADVIGKSMNQAICIPLYRDTRSEDSTRRLFKFP 

FLWNNKTSNLHYLLFTILEDSLYKMCILRRHTDIS 

QSVSNGLIAIKFGSFTYATTEKVRRSIYSCLDAQF 

YDDETVTVVLKDTVGREGRDRLLVQLPLSLVYN 

SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 

WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 

VFEMDroDEWELDESSDEEEEASNKPVKIKEEVL 

SESEAENQQAGAAALAPEIVIKVEKLDPELDS 


3063 


A 


50 


849 


DBCMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SNIPFFIFGPLMMLLMHPYAQKRSRYIYWWVLF 

MnGLFSMYFHMTLSFLGQLLDEIADLWLLGSGYS 

IWMPRCYFPSFLGGNRSQFIRLVFrrTVVSTLLSFL 

RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLISITFPYGMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 


3064 


A 


1523 


925 


AATMADGQMPFSCHYPSRLRRDPFRDSPLSSRLL 

DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKDGYVEVSGKHEEKQ 

QEGGIVSKNFTKKIQLPAEVDPVTVFASLSPEGLL 

EEAPQVPPYSTFGESSFNNELPQDSQEVTCT 


3065


A 


230 


2929 


LSTSLTGSHLFSLGNHSTRENLNAGNFNFPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF 

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKABCEVAEQTLGSGLDSFE 

LIPFKAALRSKMTFHIHAVNNQGRrVPLDSEDSLS 

FVKTACMAVYDffDLLGGNGCLGSWFSESFLTS 

QILVKEKDGTVTTETSSWLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSNLQSWPEBGNVHFFSSGLLFSHCRHGSmSKD 

HGSSNFLMIALFPKSKIYQAFYSEVFSLWKQQDN 
SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 
GEKRSSLKLLSAKLPELDWFLQHFAISSISQEPVM 
RTHLPVLLQQAEINTTHRIESDKVUSIVTGLPGCH 
ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQG YTDVID WQALQTHPDSNVKASFnGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNWFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

IQSSDG'SPFSGNIYHILGKVKFSDSERTMEVCYNT 

LAnSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 

vfigcslkedsikdwlrqsakqjcpqrkalktrg 
mltqqeirsihvkrhleplpagyfyngtqfvnff 
gdktdfhplmdqfmndyveeanreiekynqele 
qqeyhdlfelkp 


3066 


A 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR

GLRATYHRLLDKVELMLPEKLRPLYNHPAGPRT

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS

AVLMATGFIWSRYSLVUPKNWSLFAVNFFVGAA

GASQLFRIWRYNQELKAKAHK


3067 


A


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ

KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL

ALMGKLGENMILKRAAWVKVPSGFYVGSYVHG

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML

SQPYLLDPSITLGQYVQPQGVSWDFVRFECGEG

EEAAETE


3068 


A 


3 


1679 


nsrvwgpwtepsagslrpmarkqnrnskelgl 

vpltddtshagppgpgrallecdhlrsgvpggr 

rrkdwscsllvaslagafgssflygynlswna 

ptpyikafyneswerrhgrpidpdtltllwsvtv 

sifaigglvgtlivpcmigkvlgrkhtllanngfai 

saallmacslqagafemlivgrfimgroggvals 

vlpmylseispkeirgslgqvtaificigvftgqll 

glpellgkestwpylfgviwpawqllslpflp 

dsprylllekhnearavkafqtflgkadvsqev 

eevlaesrvqrsirlvsvlellrapyvrwqwt 

VIVTMACYQLGGLNAJWFYTNfSff GKAGIPPAKIP 

yvtlstggietlaavfsglviehlgrrplliggfg 

LMGLFFGTLTITLTLQDHAPWVPYLSrVGILAIIAS 

fcsgpggipfiltgeffqqsqrpaafiiagtvnwls 
nfavgllfpfiqksldtycflvfaticitgaiylyf 
vlpetknrtyaeisqafskrnkayppeekidsav 
tdgkingrp 


3069 


A 


861 


300 


aagavvsampkakgktrrqkfgysvnrkrlnr 
narrkaapriecshirhawdhaksvrqnlaemg 
lavdpnravplrkrkvxamevdieerpkelvrk 
pyvlndleaeaslpekkgntlsrdlidyvrymv 

llINnvjE.JJ I iSAMAKIJllJ<>JN I Y yjJ 1 rKy lKoJ\IJN V Y 

krfypaewqdfldslqkrkmeve 


3070 


A 


325 


2019 


laepevatdsgqqadlpaeggdpraeascsvlh 
skphamadsrdpasdqmqhwkeqraaqbcadv 
lttgagnpvgdklnvitvgprgpllvqdwftd 
EMAHFDRERIPERWHAKGAGAFGYFEVTHDIT 

KYSKAKVFEfflGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGN\\TDLVGNNTPIFFIRDPILF 

PSFIHSQKROTQTHLKDPDMVWDFWSLRPESLH 

QVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTDQGKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNR^JPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPNYLHIPVNCPYRARVA>!YQRDGPMC 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRPCR 
LCENIAGHLKDAQinQKKAVKNFTEVHPDYGSH 
IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA
ML 


3071 


A 


1 


1187 


SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSWQDSRLDTIFFAKQVINNACATQAIVSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVHNSFARQQMFEFDTKTSAKEEDAFHF 

A/Q V\n>\/XTr5T? T VT3T Fini PT-nDTFiT r;AnvT/~\T»r\W/TC 
Vol VlNOKLl ELJL^ObKJtUJriJULOACN 

AVRPVmmQKYSEGEIRFNLMAIVSDRKMIYEQ 
KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 
NQMLIEEEVQKLKRYKIENIRRKHNYLPFIMELL 
KTLAEHQQLIPLVEKAKEKQNAKKAQETK 


3072 


A 


103 


2775 


RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRREPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDKKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKRLEEQRQKEERRRAAVEEKRRQRLEED 

KERHEAWRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSWT^LTPTHSF 

LARSkSTAALSGEAVIPlCPRSASCSPnMPYKAAH 

SRNSMDRPKLFVTPPEGSSRRRIfflGTASYKKERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

IRPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRELAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARVREEAERVRQEREKHFQREEQERLERKKRL 

PPTXyfVT?TT?PTP ATT^vvxcT^r^DXT/ririT A vr^ 'rr^n- 
CJllIVLNJv I KJv 1 CJK 1 UJvJS. 1 oJJl^KJN OUIAJvLtAL 1 OO 

TEVSALPCTTNAPGNGKPVGSPHWTSHQSKVT 
VESTPDLEKQPNENGVSVQNENFEEUNLPIGSKP 
SRLDVTNSESPEffLNPILAFDDEGTLGPLPQVDG 
VQTQQTAEVI 


3073 


A 


67 


2415 


PPRVCRDHVCLICWDPIAGTGGSRSTMPALPLDQ 
LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDhnPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS

APEWDMQKRLHRSVFLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMGNIFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFGYRLA 

SFYEAAIPEMESAUCKRRLEDSKLLGDLWQRLSH 

SRKKLMEIFHIILNQICLLPILESSCDNIQGFIEEFL 

QIFSSLLQEKRPLRDYDALFPVAEDISLLQQASSV 

LDETRTAYELQAVESAWEGVDRRKATDAKDPSV 

lEEPNGEPNGVTVTAEAySQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSWVEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNOVGANDADSDDELISRRPFTIPOVLRTKVPRF 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK 

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 


3074 


A 


3


251 


GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 


3075 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAEREYGKYTAEI

KQNLERGTAV 


3076 


A 


255 


982 


SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL

RKFRELHLMKNEARKLNHQEVVEEDKRLKLPAN

WEAKKARLEWELKEEEKKKECAARGEDYEKVK

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI

KQNLERGTAV 


3077 


A 


1 


968 


FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIYTGGATGIGKArVKELLELGSNWI 

ASRKLERLKSAADELQANLPPTKQARVIPIQCNIR 

NEEEVNMLVKSTLDTFGKINFLVNNGGGQFLSPA

EHISSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSrVNITVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSWCFLLSPAA 

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSWKKMKETFKEKAKL 


3078 


A 


2 


3508 


FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 
GQSLPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPAMYLDCISDLRQKEITDGIHSSSDINILYN 

bAVESCIQDPSAEGLSEEVPWFEELPWFEDVA 

VYFTREEWGMLDKRQKELYRD\T^[RMNYELLAS 

LGPAAABGPDLISKLERRAAPWKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSKEEGDGPRRlkRTYRPRSlQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKILQLLQSTGTVILGKYRNRTACTQFIKYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVOSYITLAPLYSETADGYFETIVSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAWDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEXVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

PTGYSEEALLEEWLGLKTIAQHLPFSMLCKNALA 

OHCRFPLLSK T M A WVr VPTSTSrrFR rJFIf A MTvJ 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 
PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 
RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKD 
CIMEPPERLLYPHTSQEAPGMS 


3079 


A


343 


1513 


FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVKLLLLGAGESGKSTIVKQMKIIHEDGFSGED 

VKQYKPVVYSNTIQSLAATVRAMDTLGIEYGDK 

ERKADAKMVCDVVSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

AADYQPTEQDELRTRVKTTGIVETHFTFKNLHFR 

LFDVGGORSERKKWIHCFEDVTATTFrVAI <;r;vn 

Q\a.HEDETThmMHESLKLFDSIQ>JNKWFTDTSn 
LFLNKKDBFEEKIKKSPLTICFPEYTGPSAFTEAVA 
YIQAQYESKNKSAHKEIYSHVTCATDTNNIQFVF 
DAVTDVnAKNLRGCGLY 


3080 


A 


41 


997 


EARTARELTDGVTDGLTMADQPKPISPLKNLLA 

GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 

MYSGTFDCFRKTLFREGITGLYRGMAAPnGVTP 

MFAVCFFGFGLGKKLQQBTOEDVLSYPQLFAAG 

MLSGVFTTGIMTPGERIKCLLQIQASSGESKYTGT 

YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 
GEFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 
RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 
EVAMKFLNWATPNL 


3081 


A 


3 


1996 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 
NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSPffiSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADDISSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETRIEVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTTRWRIRRDEEGNEI 

KESNARIVKWSDGSMSLHLGNEVFDVYKAPLQG 

DHNHLFIRQGTGLQGQAVFKTKLTFRPHSTDSAT 

HKKJVl I JLoJLAUKL-biS. 1 (^KlKiLrMAGKDPECQRTE 

MnOOBEERLRASIRRESQQRRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAADCNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 


3082 


A 




921 


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA 

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

JMr^A Ir/UilJJJtUUJUlUi^rOolJNhtfcDKJEAAQLREE . 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 


3083 


A 


3 


921


VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELWRIASLEVENQSLRGWQELQQAISKLEA

RLNVLEKSSPGHRATAPQTQHVSPMRQVEPPAK 

Jvr A I r/\j^UUt,UUUUjLr usiJlS JlttDKxiAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA

FNKI 


3084 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 
EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRE.QKGE.RRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 
QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 
FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER
LNMGEIETLDDY 


3085 


A 


128 


4050 


KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSEDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPVVGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG

YFTMSLLVKRACDESFQPLGDEMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEEKKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 
PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 
SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 
STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 
LFQGVNKAQDGFTQWCEQMLHALNTANNLDVP 

TTTVQ'RT T(rTH\n?CPVT?\nTr*'VTD A VT nTYTQX^ A VTTC A V 
ir VorJ^rwJtl V Jior I Jfc Vniy 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 
QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 
FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 
LNMGEIETLDDY


3086 


A 


675 


1334 


LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 
LEAQIPLCANLVPVPITNATLDRITGKWFYIASAF 

RQDQCrWTTYLNVQRENGTISRYVGGQEHFAH 
LLILRDTKTYMLAFDVNDEKIWGLSVYADKPET 
TKEQLGEFYEALDCLRIPKSDWYTDWKKDKCE 
PLEKQHEKERKQEEGES 


3087 


A 


1 


1575 


CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

VVRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNTVRLHDSISEEGFHYLVFDLVT

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKFN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

VKPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVH 

LIEAINNGDFEAYTBaCDPGLTSFEPEALGNLVEG 
MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 
AACIAYIELTQYIDGQGRPRTSQSEETRVWHRRD 
GKWLNVHYHCSGAPAAPLQ 


3088 


A 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 
QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 
DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 
SLRNLISQGWAVNnXADHVSPLHEACLGGHLSC 
VKELLKHGAQVNGVTADWHTPLFNACVSGSWD 
CYNLLLQHGASVQPESDLASPIHEAARRGHVEC 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 
LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 
LEREGPPSLMQLCRLRIRKCFGIQQHHKITKLVLP 
FDT If OFT T HT 


3089 


A 


73 


432 


DMAGLMTWTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKRIPENRWSYQLSSRSTCLKAGVIFTTKK 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 


3090 


A 


4627 


611


LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

RAVSHTDVTKKDLKVCVEFDGESWRKRRWIEV 

YSLLRRAFLVEHNT VT AFT?K<?PFT«?FRTVnWPATT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKNLVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 
GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSELNTKS

EALRTKPDVCKAGLLSKSSQIGTGDLKBLTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNWAQLPKCRECRLDSLRKD 

KEQQKDSPWCRFFHFILRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTKNWGroLDTAKYILANI 

GDHFCQMVISEKEAMSTIEPHRQVAWKRAVKG 

VREMCDVCDTTEFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 

PTQIBPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASICPAGSMKPACPASTSPLN 

WLADLTSGNVNKENKEKQPTMPILKNEIKCLPPL 

PPLSKSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPKILDDIFASLVQNKTTSDLSKR 

PQGLTDCPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRXEFGEQEVDLVNCRTNEnXG ATVGDFWD 

GFEDWNRLKNEKEPKCVT.KLKDWPPGEDFRDM 

MPSRFDDLMANIPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMWAYGLITPEDRKYGTTNLHLDVS 

DAANVMVYVGIPKGQCEQEEEVLKTIQDGDSDE 

LTTKRFTFnKFKPfrAT WHTVAA1<rr>'m<rrRPTrT IfTcT 

VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 
EYGVQGWAIVQFLGDVVFIPAGAPHQVHNLYSC 
IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHThfHE 
DKLQVKNVIYHAVKDAVAMLKASESSFGKP 


3091


A 


97 


1838 


KRGARRGGWKRKMPSTDLLMLKAJEPYLEELEV 
YSTKAKNYVNGHCTKYEPWQLIAWSWWTLLI 

VWG YEFVFQPESLWSRFKKKCFKLTRKMPIIGRK
IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 
LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 
EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 
VRIACSLFNGGPDSCGCVTSGGTESILMACKAYR 
DLAFEKGKTPErVAPQSAHAAFNKAASYFGMKI 
VRWLTKMMEVDVRAMRRAISRNTAMLVCSTP 
QFPHGVmPVPEVAKLAVKYKIPLHVDACLGGFL 
IVFMEKAGYPLEHPFDFRVKGVTSrSADTHKYGY 
APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTIAGSRPGGISAACWAALMHFGENGYVEATKQI

lYRLSNLMTAKGWNLNQLQFPPSIHFCITLLHAR 
KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 
MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 
MNGSPKPH 


3092 


A 


79 


2652 


LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVRI 

VSRGRMPLFALNPRSGSLITARRIDREELCAQSM 

PCLVSFNILVEDKMKLFPVEVEIIDIND>FTPQFQL 
EELEFKMNEirrPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLELTASDGGEPVRSGTLRIYIQWDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFH>ATDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVTITSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFffGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNnTTAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLffiDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS 

SNVSLSLFLLDQNDNAPEBLYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

WAVQDHGQPPLSATVTLTVAVADRIPDILADLG 

^Lfi-tr{>J\r%TriLJoXJLi i L. I L* V V AllAA V Vr J_*Ar Vl V 

LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 
DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNYAD 
TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 
ECISYLEKNNS 


3093


A 


1 


3868 


ppdnqklglleallkigdwqhaqnimdqmppyy 

aashklialaicklihitffiplyrsvtswavdhag 

flesdpcdstvghllsrvgvpkgakgspvnalq 

nkrapkqaesfedlrrdvfnmfcylgphlshdpi 

lfakvvrigksfmkefqsdgskqedkektevils 

cllsitdqvllpslslmdcnacmseelwgmfkt 

fpyqhryrlygqwknetynshpllvkvkaqtid 

rakyimkrltkenvkpsgrqigklshsnpttt.fd 

yvcfeilsqiqkydnlitpwdslkyltslnydvl 

acilsncnealanpekermkhddtnsswlqsla 

sfcgavfrkypidlagllqyvanqlkagbcsfdl 

lelkevvqkmagieiteemtmeqleamtggeql 

kaeggyfgqirntkkssqrlkdalldhdlalpl 

cllmaqqrngvifqeggekhlklvgklydqch 

dtlvqfggflasnlstedyikrvpsidvlcnefht 

phdaafflsrpmyahhisskydelkksekgskq 

qhkvhkyitscemvmapvheawslhvskvwd 

dispqfyatfwsltmydlavphtsyereynklk 

vqmkatodnqemppnkkkkekerctalqdkll 

eeekkqmehvqrvlqrlklekdnwllakstkn 

etitkflqlcifprcifsaidavycarfvelvhqq 

ktpnfstllcydrvfsdiiytvascteneasrygr 

flccmletvtrwhsdratyebcecgnypgfltil 

ratgfdggnkadqldyenfrhwhkwhyklt 

kasvhcletgeythirnilm.tkilpwypkvlnl 

gqalerrvhkicqeekekrpdlyalamgysgql 

ksrksymipenefhhkdppprnavasvqngpgg

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

kekekekkektpattpearvlgkdgkekpkfer 

PNKDEKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLIELKESSAKLYINHTPP 
KERKRDHSNNDREVPPDLTBCRRKEENGTMG VSK 
HKSESPCESPYPKEKDKEKNKSKSSGKEKGSDSF 
KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 
KEEKKHHKSSDKHR 


3094 


A 


2 


891 


AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 

PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 

ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 

KYDKNSDGKIEMAELAQILPTEENFLLCFRQHVG 

SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 

KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 

SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY 

DKDRSGYIDEHELDALLKDLYEKNKKEMNIQQL 

TNYRKSVMSLAEAGKLYRKDLEIVLCSEPPM 


3095 


A 


1685 


700 


RRPTGRPGALGAPAAGRVGMPLHVKWPFPAVPP 

LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEBCRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 

LGKCVPVCRGAEFFQAENEGKG VLDTGRHMPG 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPIILPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP 

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 


3096 


A 


6642 


4022


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 
EAQPEWLRAiEVKRLSHELAETTREKIQAAEYGL 
AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 
GQAHTNHKKVAADGESREESLIQESASKEQYYV 
RXVLELQTELKQLRNVLTNTQSENERLASVAQE 
LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 
ELEEENISLQKQVSVLRQNQVEFEGLKHEKRLE 
EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 
EQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKF 
SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 
TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM 
QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 
VTRLTENLSALRRLQASKERQTALDNEKDRDSH 
EDGDYYEVDINGPEILACKYHVAVAEAGELREQ 
LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 
SLLEKASRQDRELLARLEKELKKVSDVAGETQG 
SLSYAQDELVTFSEELANLYHHVCMCNNETPNR
VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 
LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 
PRREPMNIYNLIAJIRDQIKHLQAAVDRTTELSRQ 
RIASQELGPAVDKDKEALMEEILKLKSLLSTKRE 
QITTLRTVLKANKQTAEVALANLKSKYENEKAM
V I ii 1 MMKI.RNELKALKED AATFSSLRAMFATRC 
DEYTTQLDEMQRQLAAAEDEICKTLNSLLRMAIQ 
QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 
TPSVSHTCACASDRAEGTGLANQVFCSEKHSIYC 
D


3097 


A 


1 


879 


MVKWPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWYQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD 
UiUAvjKAV lLC^yurrNQuYKJK.GAEVlLNYGRLRG 
TLSALLSWCHLH>nWSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHWDLLDSIEDMDLCHVVPAE 
KKIDEAKDBRLCENNAEF>fKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 


3098 


A 


2 


505 


GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 
UKK VJLGJLKJl WOKi'AabKJiCSLCQRLKIlELNMGD 
VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 
FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 
ENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 

m 


3099 


A 


144 


1386 


WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRVVRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV 

LDGVRYLHALGrraRDLKPENLLYYHPGTDSKIII 

TDFGLASARKKGDDCLMKTTCGTPEYIAPEVLV 

RKPYTNSYDMWALGVLAYILLSGTMPFEDDNRT 

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWWSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 


3100 


A 


3 


1500 


ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV 

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLKTYLNKAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV

STGNYEYRLILRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPEEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 


3101


A 


1173 


197 


QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA 

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAl^ffiKCKDAGLAKSIGVS 

N r NKKl^LbMiLN KPOLKYKP VCjS Q VECOT 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 


3102 


A 


144 


1098 


EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

WSLPACDHLHKNESVLKAKAWAFHRGNFREL
YKILESHQFSPHNHPKLQQLWLKAHYYEAEKLR 
GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 
EKSRGVLREWYAHNPYPSPREKRELAEATGLTT 










TQVSNWICNRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 


3103 


A 


111 


1582 


LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKNKEDSNIKENSSGAGKTBCRAFD 

FSAHGRRHVALRIAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRmSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPEUDELLNIEKNPQKPQYSMAVEFPLVLY 

JL'L.JU'tN V A Wl y JLJyjiAyjbrJNl J nJLyyL WAW 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 
NVKPSVIKQTSAFVEGVKMRTYKPLMDRPKCQG 
LESRIQHFVRRGRIEHPHLFHEEETICAKRDCNDT 
LEEDNTNLETPTKRVCVDTEIKSII 


3104 


A 


227 


1519


VTLIKMNAMLETPELPAVFDGVKLAAVAAVLYV 
IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 
KEYIPPLIWGKSGHIQTALYGKMGRVRSPHPYGH 
RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 
GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA
LPNffiLTSPRMFTYGCTWEFGAMVNYIKKTYPLT
QLVWGFSLGGNTVCKYLGETQANQEKVLCCVS
VCQGYSALRAQETFMQWDQCRRFYNFLMADN

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 
HRIYVPLMLVNAADDPLVHESLLTIPKSLSEKRE 
NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 
VEYANAICQWERNKLQCSDTEQVEADLE 


3105 


A 


1 


1251 


MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYDCPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFRELRDTALTRRWVTKKIKVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPWPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHOLA 

IDVDLVFGKSAYLFVKLSRWGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSQFEGRPMPQLTSnVNQ 

LJ<JsJJ.i<J<Js±l 1 LrN 1 KlKr IsJrr r r Y 1 L yGr cbUbB 

HlfflQQWALTEGRLKVTLLECSRLLIFGSYDREA 

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLt 

VPLRQCPG 


3106 


A 


972 


468 


MAAAGAGRLRRVASALLLRSPRLPARELSAPAR 

JjI rlJ<>js.V ¥0x1 1 cNrKiN VObJLlJKlbJU^ VCjIULVO 

APACGDVMKLQIQVDEKGKTVDARFKTFGCGSA 

lASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP 

VKLHCSMLAEDAIKAALADYBXKQEPKKGEAE 

KK 


3107 


A 


106 


1221 


TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGryFLETSERMEPPHLVSCS 
VESAAKTYPEWPWFFMKGLTDSTPMPSNSTYPA 










FSFLSAID>A'FLFPLDMKRLLEDTPLFSWYNQINA
SAERNWLHISSDASRLANWKYGGIYMDTDVISIR
PIPEENFIAAQASRYSSNGIFGFLPHHPFLWECME
NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF
QEVSDLRCLNISFLHPQRFYPISYREWKRYYEVW

DTEPSIW'SYALHLWNHMNQEGRAVIRGSNTLV 
ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 


3108 


A 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ
LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE
KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM
VDKHIKRLDTDLARFEADLKEKQIESSDYDSSSS
KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA
QKKLKLYRTSPEYGMPSVTFGSVHPSDVLDMPV
DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA
CVGLTTKPRGKWFCPRCSQERKKK


3109 


A 


1 


2613 


MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC
RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL
QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET
AYQFSVLAQNKLGTSAFSEVVTVNTLAFPITTPEP
LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR
YMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT
WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT
EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK
QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE
SIRTLRAPSESSDDQGQPAAKRMLSPTREKELSL
YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP
DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE
FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL
PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA
PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL
SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT
QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP
PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL
EAPRGWAGKSPGRGPVPAPPAAKWQDRPMQPL
VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST
FGLDTRWYEPQPRPRPSPRQARRAEPSLHQWLQ
PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA
EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS
YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG
QTPSPRRTGEELLRPETPPPTLPH,GKLRRDRPAP
ATSPPERALSKL


3110 


A 


88 


924 


ILGSRTMSLTNTKTGFSVKDELDLPDTNDEEGSV
AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL
BCNPFYDSSDNPYTRWLASTEGLQYSLHGLAAGA
PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK
RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA
SLIRLTPTQVKIWFQNHRYKMKRARAEKGMEVT
PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF
QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT

ArirLVv^Al^v^Wl W 


3111 


A 


595 


291 


GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE

PK 


3112 


A 


3641 


1555 


APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST








RHThn.SNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLEPYPLITKEDINAffiMEEDKRDLISREISKFRDT 

HKKLEEEKGKKEKERQEIEKERRERERERERERE 

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKl-rWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGniPHENSPDQQ 

QPEEHRPKIGLSLKLGASNSPGQPNSVKRKKLPV 

i^o V r IN jsj" cwjalJ bJJ JJ V r KlvKJsJj V r 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW 

SI\TDSILMERRIRPWINKKnEyiGEEEATLVDLVC 

SKVMAHSPPQSELDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 


3113 


A 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET 
NILICMTTPNKTPPGADPKQLERTGTVREIGSQAV 

VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT 
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 


3114 


A 


1 


1613 


MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELVVPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFnGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LKATVTPNRVKMLPYFGIIRjmiMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIRQSKQQKITQAIERLVEDLIQESMAKGDF 

FiMF QriTi^n'k^PT li'VTJCr^r'CVTrk'DAyfT'LrKTr xtott tt^xt*^ 
UiNUooivoiyr J_tJSJsx oUL^o I UJr M 1 JtliN LflNKlLIUiN (j 

YQPEWILKQKEISDTffiQLREAILVSRKKLGNPMT 
PTEKKQ\W>mvCEQFQENIRKLNKRINDFNLIVPI 
LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 
PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 


3115 


A 


1 


2036 


FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAIDGE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSBDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQI<KFIDQVVEKIEDLLQSEENKNLDL 

EPCTGFORKLrYOTLSWKYPKGIHVETLETEKKE 

RYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIANSGKLVIGHNMLLDVMHTVHQFYC 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

nNNTSLAELEKRLKETPFNPPKVESAEGFPSYDT 










ASEQLHEAGYDAYITGLCnSMANYLGSFLSPPKI 
HVSARSKLIEPFFNKLFLMRVMDIPYLNLEGPDL 
QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS 

vvLLfiJ I o/VT V oL^oV^x^liV^ ViSJ-A VIN I oJvl Alio I JKJV^ 1 

YAEYMGRKQEEKQIKRKWTEDSVl'KEADSKKLN 
PQCBPYTLQNHYYRNNSFTAPSTVGKKNLSPSQE 
EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 
KKLKRMKKELSPAGSISKNSPATLFEVPDTW 


3116 


A 


3 


1443 


TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRJLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL 

ALRTTIDLTCSGHVSrFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGQWAIGYVSSDGSILQTIPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

Vt^Xrl V ollxlt^JLyi/ 1 WAMlJo IrtlLCJMCAlioNJsX' V 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 
WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 
VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 
AALGPQDPAPA


3117 


A 


296 


3547 


ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 

QSSirLCFSPVGRTLRVRARKFPAIVNCTAIDWFH 

AWPQEALVSVSRRFffiETKGEEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNYTTPKSFLEQISLF 

KNLLKKKQNEVSEBOCERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITKIGLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT 

NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK 

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLWSANYD 

lEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAWNNEGLPSDRMSTENAAIL 

THOERWPLVIDPQQQGIKWIKNKYGMDLKVTHL 

GQKGFLNAffiTALAFGDVILIENLEETIDPVLDPL 

LGKNTKKGKYIRIGDKECEFNKNFRLILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL 

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKINEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRISI 

LMESITHAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEJDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFILSPGVDALKDLEILGBaiLGrnDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRDYRVFMSAESAPTPD 

EHIIPQGLLENSIKITNEPPTGMLANLHAALYNFD 










Q 


3118 


A 


1 


226 


PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE


3119 


A 


1254 


4133 


PLATLTMEEQGHSEMEDPSESHPfflQLLKSNREL 
LVTHIKNTQCLVDNLLKNDYFSAEDAEIVCACPT 
QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY 
VDLRPWLLEIGFSPSLLTQSKVWNTDPVSRYTQ 
QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 
LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 
GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH 
FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 
VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 
SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 
GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 
ALQDRLLSQLEANPNLCSLCSVPLFCWnFRCFQH 
FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM 
QPSSLVQKNTRSPVETLHAGRDTLCSLGQVAHR 
GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 
ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 
TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 
SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL
RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 
VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 
AARGICANYLKLTYCNACSADCSALSFVLHHFP 
KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 
VNQITDGGVKVLSEELTKYKIVTYLGLYNNQITD 
VGARYVTKILDECKGLTHLKLGKNKITSEGGKY 
L.ALA V KJN blvalisb V (jM W ONC^ V OUbO AKAF AEA 
LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 
EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 
IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 
EEAKVYEDEKRHCF 


3120 


A 


43 


1004 


QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPK 

KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 

LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK 

NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

FICPWGLEMNGRHRFCFLRCCGCVFSERALKEI 

KAJbVCH 1 UCjAAFQEDDVIVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP

GPSKVKTGKPEEASLDSREKKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 


3121 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 
DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 
ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 
QNEANKYQISVNKYRGTAGNALMDGASQLMGE










NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGWWMNWKGSWYSMKKMSMKIRP
FFPQQ 


3122 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 
SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 
PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 
HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 
NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV
KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR
VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 
CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP
YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 
DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 
ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV
QNEANKYQISVNKYRGTAGNALMDGASQLMGE
NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGWWMNWKGSWYSMKKMSMKIRP 
FFPQQ 


3123 


A 


3 


1490 


HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKXREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGWWMNWKGSWYSMKKMSMKIRP 

FFPQQ 


3124 


A 


3 


544 


RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

CP CD TCOAyf A 130 A CD A t>*^Xynj A A DD T3 A n\ r A /^n'n A A 

oKatv 1 oKMArr AbKAl^QMKAAr KJ:^ Ar V AQrr AA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 
SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 
GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 
NEVLKQCRLANGLA 


3125 


A 


3 


571


GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 
AV 1 jvoruu i JJv^l^Js. 1 1 VulNulllKTNCjKOcLKJLRKrR 
TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 
LTQTQVKIWFQNKRSKFKKLLKQGSNPHESDPL
QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 
MPGYSHWYSSPHQDTMQRPQMM 


3126


A 


43 


5377 


LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 
SAMSGRNELHSRLHPHPQSSLffMMFSPPESLLAS 
CELRGNFAEAHOVLFTFNLKSSPSSGFT MFMFRY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 
LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPIPM 
LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 
LACSQCQLWKTCKQLLETAERRLNSSLERRGRRl 










DHVLLNADGIRGFPVVLQQISKSLNYLLMSASQT
KSESVEEKGGGPPRCSITELLQMCWPSLSEDCVA
SHTTLSQQLDQVLQSLREALELPEPRTPPLSSLVE
QAAQKAPEAEAHPVQIQTQLLQKNLGKQTPSGS
RQMDYLGTFFSYCSTLAAVLLQSLSSEPDHVEVK
VGNPFVLLQQSSSQLVSHLLFERQVPPERLAALL
AQENLSLSVPQVRVSCCCEPLALCSSRQSQQTSSL
LTRLGTLAQLHASHCLDDLPLSTPSSPRTTENPTL
ERKPYSSPRDSSLPALTSSALAFLKSRSKLLATVA
CLGASPRLKVSKPSLSWKELRGRREVPLAAEQV
ARECERLLEQFPLFEAFLLAAWEPLRGSLQQGQS
LAVNLCGWASLSTVLLGLHSPIALDVLSEAFEES
LVARDWSRALQLTBVYGRDVDDLSSIKDAVLSC
AVACDKEGWQYLFPVKDASLRSRLALQFVDRW
PLESCLEILAYCISDTAVQEGLKCELQRKLAELQ
VYQKILGLQSPPVWCDWQTLRSCCVEDPSTVMN
MILEAQEYELCEEWGCLYPPREHLISLHQKHLL
HLLERRDHDKALQLLRRIPDPTMCLEVTEQSLDQ
HTSLATSHFLANYLTTHFYGQLTAVRHREIQALY
VGSKILLTLPEQHRASYSHLSSNPLFMLEQLLMN
MKVDWATVAVQTLQQLLVGQEIGFTMDEVDSL
LSRYAEKALDFPYPQREKRSDSVIHLQEIVHQAA
DPETLPRSPSAEFSPAAPPGISSIHSPSLRERSFPPT
QPSQEFVPPATPPARHQWVPPETESICMVCCREH
FTMFNRRHHCRRCGRLVCSSCSTKKMWEGCRE
NPARVCDQCYSYCNKDVPEEPSEKPEALDSSKSE
SPPYSFWRVPKADEVEWILDLKEEENELVRSEF
YYEQAPSASLCIAILNLHRDSIACGHQLIEHCCRL
SKGLTNPEVDAGLLTDIMKQLLFSAKMMFVKAG
QSQDLALCDSYISKVDVLNILVAAAYRHVPSLDQ
ILQPAAVTRLRNQLLEAEYYQLGVEVSTKTGLDT
TGAWHAWGMACLKAGNLTAAREKFSRCLKPPF
DLNQLNHGSRLVQDVVEYLESTVRPFVSLQDDD
YFATLRELEATLRTQSLSLAVIPEGKIMNNTYYQ
ECLFYLHNYSTNLANSFYVRHSCLREALLHLLNK
ESPPEVFIEGIFQPSYKSGKLHTLENLLESIDPTLES
WGKYLIAACQHLQKKNYYHILYELQQFMKDQV
RAAMTCIRFFSHKAKSYTELGEKLSWLLKAKDH
LKIYLQETSRSSGRKKTTFFRKKMTAADVSRHM
NTLQLQMEVTRFLHRCESAGTSQITTLPLPTLFG
NNHMKMDVACKVMLGGKNVEDGFGIAFRVLQ

JJJrt^LUAAMl iORAARQLVEKEKYSEIQQLLKCV 

SESGMAAKSDGDTILLNCLEAFICRIPPQCCFCSA
QELEGLIQAIHNDDNKVRAYLICCKLRSAYLIAV
KQEHSRATALVQQVQQAAKSSGDAWQDICAQ
WLLTSHPRGAHGPGSRK


3127 


A 


467


1259 


HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP
PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL
KTQLATLTSSLATVTQEKSRMEASYLADKKKMK

C^UiJiUA6NJfLAfa££RARLEGELKGLQEQIAETKA 
RLITOOHDRAOFO'^n'HAT MT RFT 01<rT T OFFPTO 

RQDLELRLEETREALAGRAYAAEQMEGFELQTK
QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA
ARLKSHFQAQLQQEMRKVHHISFKHQPLT


3128 


A 


1854 


798 


ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL










LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTWSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLmVVLLGlA 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 
GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 
SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 
NSDTKTRTASGYGGTRRR 


3129 


A 


2340


1192 


ELARRPKQQSSEKSR2>JMIRNWLT]FILFPLKLVEK 

CESSVSLTVPPWKLENGSSTOVSLTLRPPLNATL 

VITFEITFRSKNITILELPDEWVPPGVTNSSFQVT 

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISn 

NQVIG\\'IYFVAWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLVA^YIKEQFLLKY 

PNGVNPVNSNDVFFSLHAWLTLinVQCCLYERG 

GORV^WPATGFT VI AWT TTAFVTN/TTVA AVOVTTW 

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 
WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 
GDPTKFGLGVFSATDWFTIQHFCLYRBCRPGYD 
QLN 


3130 


A 


31 


2026 


CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

VVGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRVFQFCLRYTKEEEViaaVSGIIHHTQAP 

KLLKRLFLFSYATAAQNNTVTDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFWPTPLPEENVQRFQGHGIPrWCWSCHNGS 

ALLKMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 

EFWDTDKWFSLLESSSWLDIIRRCLKKAIEITEC 

MEAQlsWVLLLEENASDLCCLlSSLVQLMMDPH 

CRTRJGFQSLIQKEWVMGGHCFLDRCNHLRQND 

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

l.GKRT'?KT,TNR'?T~)FT nri>JPl?T:FVn«?\xrTTRlir<?'mVH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 
ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 
APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 
KALGDEDDLAKREDEFVDLGDV 


3131 


A 


126 


965


QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 
WFRSPAITRYWFAATVAVPLVGKLGLISPAYLF 
LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 
NL'iTLYQySTRLETGAFDGRPADYLFMLLFNWl 
CTVTTGT.AMDIVIOT 1 TVITPT TM<\VT WW Am "KTRniwr 

rVSFWFGTRFKACYLPWVILGFNYUGGSVINELIG 
NLVGHLYFFLMFRYPMDLGGRNFLSTPQFLYRW
LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 
WGQGFRLGDQ 


3132 


A 


2 


350 


FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQIMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD 


3133 


A 


1 


2921 


MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAiSIQSSEMIATNTRHCPNCRHSDLEA 

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 

AHEQDTKMHEIYKGNITPQLNKNTTLKTSAATDV 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAELALLLHPVDQANTLKSPVSESVSPVVPDYLP 

TENGDFLSSKRKQISRDIhnfaRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS
KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV 
RTLKSQSSLSGKPKERCPPNLAPLCYSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG
NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI
STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 
DTEICLQVNQVTPDQLGNISLRHYLCNKPVGSDQ 
KAVfflSKSSPEISLRFESGPGAVIHSLLAEKNGFL 
QCHIENFSTEFLTSSLMNIQHFLEDETVATVMPM 

t^^T^^^7C'KT'T^f TXTT 1/"TM^CT>T>CCT^/0T T^TJ a T5\ /TT rr ttt\t tt 

IS-iy VaJN lJsJJNLlUJlJprl<j5bl VaLbrArV 1 VHIDHL 

VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHIKKMTVE 


3134 


A 


9 


1579 


EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEGCKVGRGTYGHVYKARRKDG 

KDEKEYALKQDEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHUKFH 

RASKANKKPMQLPRSMVBCSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVWTFWYRAPELLLGAR 

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL 

LQKLLTMDPTKRITSEQALQDPYFQEDPLPTLDV 

r AuL-v^Lf I r JSjCtSr ljJNJbJJJJrllJlls..tjL'lsjN VnJV VVf^ V 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 
AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 
SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 
QSTLGYSSSSQQSSQYHPSHQAHRY 


3135 


A 


3 


1111 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKTTCr) 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

KSGKEKPKTNIEDLQIKKVKKKKKKKHKENEICR 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ 










NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 


3136 


A 


1442 


682 


TAAMSIFTPTNQIRLTNVAVVRMKRAGKRFEIAC 

YKNKWGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERirrQLEQMFRDIATIVADKCVNPETKKPYW 

LIERAMKDIHYSVKtNKSTKQQALEVIKQLKEK

MKffiRAHMRLRFILPVNEGKKLKEKLB[PLnCVIES 

EDYGQQLErVCLIDPGCFREIDELIKKETKGKGSL 

EVLNLKDVEEGDEKFE 


3137 


A 


1 


3143 


MVEGKRHVLHGGRQERMRAKQKGKPLIKSSDL 

VRLfflYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWVNGVKPGWQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGWRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRIGFPSTSPAKA 

KkTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK 

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDBCLRAANEKYAQEVAGLBCDKVQQ 

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRH\VTIAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTHMffiSNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 


3138 


A 


110 


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAHEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 
MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 
GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 










WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDPIENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDirnVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 
GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 
PGWVQKCELRVLCCFAFSLSQDWELPAKREAQ 
QPLKEGVRDMLVKHHLFSWDVDG 


3139 


A 


110


2499 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEDLAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDEHAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIBEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLRIWDLFFYEGSRVLFQLTLGMLHL

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

WSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDUTTVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 
GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 
PGWVQKCELRVLCCFAFSLSQDWELPAKREAQ
QPLKEGVRDMLVKHHLFSWDVDG 


3140 


A 


1 


4939 


SAALGASLAIPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDEIVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENDLYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQECLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGOSPRHHLPOPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASrrWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTRRSDRFATTLKNEIQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL

AGTYKDHLKEAQARVLRATSFKRRDLDPNPGDL

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRJRKFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQJCAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRD YRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRJBRVMDNNTTVKMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEICDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

rVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKQELffiSISRKLQVLREARESLLE 

DVQANl VLOAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKWNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRERIVF 

DTLANYLSEESLADYEHFVKMKSALIIEQRELED 

KIHLGEEQLKCLLDSLQPERGK 


3141 


A 


97 


1894 


SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRWLAACSPYFHAMFTGEMSESR 

AKRVRKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEAL\na^SSACKNYLffiAMKYHLLPTEQRILMK 

SVRTRLRTPMNLPKLMVWGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRWTWJSYDPVKDQWTSVANMRDRR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGVVGGLLYAVGGYD 

CjAbRvJYLS 1 VbCYNA rrNEWTYlAEMSTRRSGA 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

AWK(s^VAJJiVlrvM»^KKi\AuvCAVNljLLY V V 

DGSCra.ASVEYYNPTTDKWTVVSSCMSTGRSYA 

GVTVIDKPL


3142 


A 


1211 


1311 


FSNLTTEKVAHAKEENLSMHQMLDQTLLELNN 
M 


3143 


A 


1809 


1041 


SEELDREKKLKEDSPRKTP>JKESGVPSLPVSLTSI 

DSRGTRVAVSSPMSQHQSYJQYLHAYPYPQMYD 
PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 
MSGREETEKV>rrSPSVNTKTTTESKALDLLQQH 
ANQYRSKSPAPVEKATAEREREAERERDRHSPFG
QRHLHTHHHTHVGMGYPLPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


3144 


A 


78 


604 


SVSGIVLDLLPYLHFLSNMNLDGSAQDPEKREYS
o vo V OAjDiJi-'LrkJ'fc.ijiiivxVi 1 A V vniJssji v vit i tiJSXxc, 
YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 
HKYKITLATGEGLYQSINPKDPS AKPKWCSKGIK 
QRIHTVTVDNGNIYVTLSNEPFKCDSDFYATGDF 


3145 


A 


2 


333 


RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 


3146 


A 


3 


1151 


VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVGIAVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FPHQVLAGLITGAVLGWLMTPRVPMERELSFYG

T TAT AT \/n nXQT TV^l/TT ThTT ni TW CA17CTCT ATTT/'M/ 

CERPEWIHVDSRPFASLSRDSGAALGLGIALHSPC. 
YAQYRRAQLGNGQKIACLVLAMGLLGPLDWLG 
HPPQISLFYEFNFLKYTLWPCLVLALVPWAVHMF 
SAQEAPPiHSS 


3147


A 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 
ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGG AA 
GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 
DEECGTDEYCASPTRGGDAGVQICLACRKRRKR 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 
VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 
TKHRRICGSHGLEIFQRCYCGEGLSCRIQKDHHQ 
ASNSSRLHTCQRH 


3148 


A 


1 


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH 

TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLAIGKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEIFSDWFHHIFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRRDFMKNFINPPVP

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKLWPSVAFAEGSSSEEELEAECFPVKPHNKSF 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 
GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 
QLRALRAVFRSQQQVRRRRGALGAVVKVEALQ 
EGPGGCGATAQSPLPCSSTAGDN 


3149 


A 


132 


4125 


VAVMISTAPLYSGVHNWTSSDRIRMCGINEERRA 

PLSDEESTTGDCOHFGSOEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFELATLGTGVPVEGTLPLVTTNFSP

LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAIPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLH.PVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSKSNRQKLPLPNQRJK.TPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFSapPEIVRNGDPSTWVKNSTALISTIPG 

TYVGV ANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPRGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG 

QLTPSQGAPIRPTSWSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCRIAAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPmPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDWFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAWRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLKNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 
ARRLIVNKNAGETLLQRAARLGYKDWLYCLQK 
DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 
HGA 


3150 


A 


3 


2795 


SLRMHNLSILVRQIKFYYQETLQQLIMMSLPNVLI 

IGBCNPFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 

lERIQGLDFDTKAAVAAKQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLroERDEH 

SETIIELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAKIRRLRQELEEKTEQLLD 

CKQELEQMEEELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVEELKEDNQVLLEIXTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTYDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

DETLRENSERQIKILEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKILHESIKETSSKLSKiEFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEKIEALEQENSELERENRKLKKTLDS 

FICNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 

KKTERLEVSYQGLDffiNQRLQKTLENSNKKJQQL

ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 

NKSLEQETSQLEKDKKQLEKENKRLRQQAEIKD 

ri J^ENhWKlQNLEKJENKTLSKJEIGIYK£SC VRLE 

ELEKENKELVKRATEDIKTLVTLREDLVSEKLKT 

QQMIsThTOLEKLTHELEKIGLNKERLLHDEQSTDD 

SRYKLLESKLESTLKKSLEEKEEKIAALEARLEES 

TNYNQQLRQELKTVKKK 


3151 


A


2 


2515 


GFWLHLTTLLGASLPAALGWMDPGTSRGPDVGV
GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 
SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 
GRFYENHCKLHRAACLLGBCRJTVIHSBUDCFLKGD 
TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 
ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 
KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 
YMAFQVVQLSLAPEDRVSVTTVTVGLSTVLTCA 
VHGDLRPPnWKRNGLTLNFLDLEDINDFGEDDS 
LYITKVTTIHMGNYTCHASGHEQLFQTHVLQVN 
VPPVmVYPESQAQEPGV AASLRCHAEGIPMPRIT 
WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 
DTGAYTCIAKNEVGVDEDISSLnEDSARKtLANI 
LWREEGLSVGNMFYVFSDDGITVIHPVDCEIQRH 
LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 
RNRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 
PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 
ASTGQSQHLERTPFAGVDDFFIPPTNLIINHIRFGFr 
FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 
MAHTHLGGYFFIQCRQDSPASAARQLLVDSVTD 
SVLGPNGDVTGTPHTsPDGRFrvSAAADSPWLHV 
QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 
YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 
GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 
NGRQNTLRCEVSGIKGGTTVVWVGEV 


3152 


A 


1 


2645 


GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA 

SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAVWLSL 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 

SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYWTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLVVQRWDPSLTREALGHWLGLLNADGWIGRE 

QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 

MLEVGDPDDLAFLRKALPRLHAWFSWLHQSQA 

fiW PI QVTJWPODT^PAT PTT T WPIi'Tr PC(^T r\r\VPP 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 
AEVAAELGPLAASLEAAESLDELHWAPELGVFA 
DFGNHTKAVQLKPRPPQGLVRWGRPQPQLQYV 
DALGYVSLFPLLLRLLDPTSSRLGPLLDILADSRH
LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQARAAKLHGE 
LRANWGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 


3153 


A 


1 


4312 


MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA 

AVLLADLERDARQGECALPGAAlvtAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVrVIQDTGFSVKILAPGffiPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV

EGLQEGSVLRWEEPYTVREARIHVRHVRDLLKS

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD

WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS

RLGYEEFFLPGQTRDWNEELQTTRELPRKNLPERL

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN

PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV

VDYRGYRVTAQSDPGILERDQEQSVIYGSROFGK

TWSHPRYLELLERTSRPLKILRHQVLNDRDEEV

ELCSSVECKGNGNDGRHYILDLLRTFPPDLNFLP

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK

ELAETIAADDGTDPRSREVIRNACKAVGSISSTAF

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA

AFLLSCQIPGLVKDCMEHAVLPVDGATLAEVMR

QRGINMRYLGKVLELVLRSPARHQLDHVFKIGIG

ELITRSAKHIFKTYLQGVELSGLSAAISHFLNCFLS

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN

TAWAVMTPQELWKNICQEAKNYFDFDLECETV

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS

RHKPAFTEEDVLNIFPVVKHVNPKASDAFHFFQS

GQAKVQQGFLBCEGCELINEALNLFNNVYGAMH

VETCACLRLLARLHYIMGDYAEALSNQQKAVL

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL

HGVMEYDLSLILFLENALAVSTKYHGPKALKVAL

GEDHEKTK:ESSEYLKCLTQQAVALQRTMNEIYR
NGSSANIPPLKFTAPSMASVLEQLNVINGILFIPLS
QKDLENLKAEVARRHQLQEASRNRDRAEEPMA
TEPAPAGAPGDLGSQPPAAKDPSPSVQG


3154 


A 


416 


4082 


KFKLIKIMLLTLIILLPVVSKFSFVSLSAPQHWSCP
EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT
NYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQ

IWSNQQEGNTVTDMKGNNSHILLSALKYPANVA
VDPVERFBRWSSEVAGSLYRADLDGVGVKALLE
TSEKRRAVSLDVLDKRLFWIQYNREGSNSLICSCD
YDGGSVHISKHPTQHNLFAMSLFGDRIFYSTWK
MKTIWUNKEH'GKDMVRINLHSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSrVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDC\a,TSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEBCSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENKIYFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKnXIENISQPRGIAVHPMAKRLFWTDTGINPRIE 

SSSLQGLGRLVIASSDLIWPSGITIDFLTDKLYWC 

DAKQSVBEMANLDGSKRKRLTQNDVGHPFAVA

VFEDYVWFSDWAMPSVIRVNKRTGBaDRVRLQG 

SMLKPSSLVVVHPLAKPGADPCLYQNGGCEHIC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARaSEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSYRNSDSECPLSHDGYCLHDGV 

CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 
GQAADGSMQPTSWRQEPQLCGMGTEQGGWIPV 
SSDKGSCPQV^4ERSFHMPSYGTQTLEGGVEKPH 
SI I SANPI WOOR AT TIPPHOMPT TO 


3155 


A 


533 


212 


GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 
VPPEEECEF


3156 


A 


2 


1585 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGFAERRIDKFGFIVGSQGAEGALEEVPLEVLRQ 

RESKWLDMLNNWDKWMAKKHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGEIL 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVLRVWDMFFCEGVKIIFRVGLVLLK 

HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 
APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEB3> 
PAPNQAMWAAAGDACPPQHVPPKDSAPKDSAP 
QDLAPQVSAHHRSQESLTSQESEDTYL 


3157 


A 


3 


601 


SSAMGSRSSHAAVIPDGDSIRRETGFSQASLLRLH 

hrfraldrnkkgylsrmdlooigalavntpi gdr 

DESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EDGDGAVSFyEFTKSLEKMDVEHKMSIRILK


3158 


A 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFITYM 
DNWRQNTTAEQEALQAKVDAENFYYVILYLMV 
MIGMFSFIIVAILVSTVKSKEJREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKATIHENIGAAGFKMSP 


3159 


A 


3 




PWfiA AFT nMnRpriAnT t a at i VT m at Ar;cx: 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGT55PFFCASRKPrF<?'NrFTFFVPWnn?P1(r<;\rPnP 
X X \jxtjx A-iijf\^^\ti7Jvivv^v^i^ oiNx^ xr Ct V X w VrfP r i v rz-l J K.^ 

HY 


3160 


A 


179 


409 


KPKTKILKMVYYPELFVWVSQEPFPNKDMEGRL 
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 


3161 


A 


683 


1186 


LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 
KDVKDGKYSQVLANGLDNKLREDLERLKKIRA 
HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 
K 


3162


A 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEADIMIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPGIGGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQ'i'METHNFKLPQDDLQGIQKlYGPPAE 

PLEPTRPLPTLPVRRHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRlsiNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRPVFFKGDKYWVFKBVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

VQPPPT? A TTiPnVPT(fPTT\/'U7V nTOr\ A Di^n A ■CTCIi^'C 
I oJCiClSJS-rv 1 JL'JrO I riSJ^l 1 V W fwOiJrV^ Ar V^OArlorkJi 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRNILRD 
WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 
GSVNAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 
PVTYYKRPVQEWV 


3163 


A 


1235 


2223


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL 

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SK^PYNG VRif r> <?R>J^R <? A <;r <;r <;r tr ir ir q 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 
HHhfflGSPHLKAKlITRDDLKSSl^GHkRKKSRS 
RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 
ERSHKSKHHGGSRSGHGRHRR 


3164 


A 


3


3274 


DCRLQAAMPTNFTWPVEAHADGGGDETAERT 

EAPGTPEGPEPERPSPGDGNPRENSPFLKINVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGWEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWrVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGWPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP

GAAIFQAEAAGGEAAAMLHhJMRVYGTCTLVLM 

ALVVFVG\nK:YVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

KNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GMAGSNRSGDLKDAQKSIPTGTILAIVTTSnYLS 

CIVLFGACIEGWLRDKFGEALQGNLVIGMLAW 

PSPWVTVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPILSMFFLMCYLFVNLACAVQTLLRTP 

^IWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTTVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASAVKQED 

OTFSWK>nFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHEDVWWIVHDGGMLMLLFFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMKKDLQMFLY 

HLRISAEVEVVEMVENDISAFTYERTLMMEQRS 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 
SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 
SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 
GLNRVLLVRGGGREVITIYS 


3165 


A 


3


2681 


GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESrVQEEKKKLTPEGNKGV 

TGSGFPFDFGKNPYKGKRPLKDUGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IIKLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWENYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

HYSCPTSEGIQKLRNVLLAFSWRNPDIGYCQGLN 

■RTVAVATT VT 'BnPriATrWPT VTr\AR\n7AADPr»WT 
ivL/ V A V /\l^Lt J IjJlV^JDU/ir W V 1 1 V Jl VriVir KJJ III 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 
DYTLITFNWFLWFVDSVVSDILFKIWDSFt YEGP 
KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 
RTILDARSGTDAPTTWRKSGWS 


3166 


A 


10 


4070 


FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR
TRAFADLLVERQTGQQDSDPYSPVnDQILEMVN 
GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH 
YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS 
VREALGVESHYSRACASSETESEAGDIMDQQFEE 
M>INKLNSVTDPTGFLRMVRRNNLFNRSCQSMTS 
LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 
lAPMEGGLMHSSGPVGRHRQLILVLEGELYLIPF 
ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 
RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG 
PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 
ALTQAECVHFATHISWKLSALVLTPSMDGNPASS 
KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 
LLTAADVLDLQLPVKLWLGSSQESNSKVAADG 
VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 
AFYSSLLNGLKASAALGEAMKVVQSSKAFSHPS 
NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 
ARDALRVLLHLVEKSLQRIQNGQKNAMYTSQQS 
VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 
FFPTSDPGDRLQQCSSTLQSLLGLPNP ALQ ALCK 
LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 
EKEQDLASAPIQVSISVQLWRLPGCHEFLAALGF 
VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF 
DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 
QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 
GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 
PQTRPAGNKDEBEYEGFSIISNEPLATYQENRNTC
FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 
MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 
VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 
RSGGQ VSKSNNPEDG VQAPSSTA VFRA SETS AFS 
RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 
SPTTSEMSIKDSPSQHSGRPSPGCDSQTSQLDQPL
FKLKYPSSPYSAHISKSPRNMSPSSGHQSPAGSAP 
SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 
QAVHNLKMFWQSTFQHSTGPMJOFRGAPGTMTS 
KKDVLSLLNLSPRPNKKEEGVDKLELBCELSLQQH 
DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 
ARPLRLPSGNGYKFLSPGRFFPSSKC 


3167 


A 


1


762 


AARRRQKGKEENMMMDLFETGSYFFYLDGENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLrWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFS YRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEWEK 


3168


A


701 


246 


TSRRVTMKJT>JPFVTSDR5K>JRKRHFNAPSHVRR 
Knv«SPLSKELRQKYNVRSMPIRKDDEVQWRG 
HYKGQQIGKWQVYRKKYVIYIERVQREKANGT 
TVHVGIHPSKWITRLKLDKDRKKILERKAKSRQ 


3169 


A 


156 


3168


GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGVWFGGFMWSAIGIFLVSTFSMKETSYEEA 
LANQRIOEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV
AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 
KKKEKXVAKVEPAVSSWNSIQVLTSKAAILETA 
PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 
PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEr 
LSEKAGHQDTWHKATQKGDPVAILKRQLEEKEK 
LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 
GEAKVKKQLVAREQEITAVQARMQASYREHVK 
EVQQLQGKIRTLQEQLENGPNTQLARLQQENSE- 
RDALNQATSQVESKQNAELAKLRQELSKVSKEL 
VEKSEA VRQDEQQRKALEAKAAAFEKQ VLQLQ 
ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 
AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 
GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 
AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 
ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 
ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 
LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP
AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 
EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS
RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 
LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 
DAAKSEAQKQSDELALVRQQLSEMKSHVEDGDI 
AuArAbbPhAPPAEQDPVQLKTQLEWTEAILEDE 
QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 
ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 
LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 
EGTSV


3170 


A


6730 


4027 


THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPffiEKAVTPSPEQVFAECSQKRILGLL 

AAMLPPLKSGPTVPLIDLEHVLPLMFQWISNAG 

HLNETYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 

KHNLFAAGDSSIVPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEILRSLNSAPLWRDVIATFTDH 

CnCQLPFQLKHTNlFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNVimiaiFTEKNRAVIVDVKTRKR 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPLHILRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 

QKLISSTESELQQSYAKQRRSKSAALLHKELNCK 

bKJ<AVRDYLFRVNEATAVLYARHVLASLLAEWP 

SHVPVSEDILFT ^nPAWMTYTT DN/TRMOT PFIfHF 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 
TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 
PERDFQLNQKALSPSSQFPSAEILRHIR 


3171 


A 


557 


89 


GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN
QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRPLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHSISDRLPEEKMQTQGSSNQ


3172


A


J. 




rKivAOAOKuKKKOr. V 1 orL.ortrL.Arl^C)LA loKR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKHA 
LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 
PAYPINIVALVFSIMSLNSYNDGDYEGARRLGKN 
AKWVAIASIIIGLLnGISCAVHFTKNA 


3173 


A 


2 


4048 


FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASAIALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPWE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

APEQPSFVSPPDSLVGQHEENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAILIENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFGPLSLIRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAELNMVNIAANILGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQIFCSELTnCCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNIEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT. 

LSQSLLLDITPEINPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEEDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDVNSMQIFTKLSETrVPPINTATVPDN 

EDGEAKMNIADTAKQTLISWDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQICESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTIVKLQNTSRIAE 

EQDQRQTEAIQLLQAQLTNMTQLVSNLSATVAE 

LKKEVSDRQSYLVISLVLCWLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMJJLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLJCFSP 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 
EESYFCGISACTSLCNGQSQKTKTEkRALKRRRS 
KVQDQGKLDCTLIQTKSGSLPSLHDIIKGNKEITV 
GTFGVTAVSGHI 


3174 


A 


485 


4668 


RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 

EVTTERVQRQSVEEEGGIANYNTSSKEQPWFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL

DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYTVNWALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKWYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLrWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YnSVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV. 

ASFDYYRVSYRPTQVGRLDSSWPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGETILVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

WADYRVGFGhA^DEFWLGLDNIHRITSQGRYEL 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 
MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 
WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 
SLQF 


3175 


A 


2 


623 


RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 
AATAEGTMASGVTVNDEVIKVFNDMKVRKSST 

v^JiCJJ^J^JvISJSJ\ V L.r ^l^oUUiSJsXliX V i^ilAJs.V^lL V \jUL 

GDTVEDPYTSFVKLLPLNDCRYALYDATYETKE 
SKKEDLVFIFWAPESAPLKSKMIYASSKDAIKKK 
FTGIKHEWQVNGLDDrKDRSTLGEKLGGNVWS 
LEGKPL 


3176 


A 


99 


1567 


PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDhfLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSrFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

lAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK 

LTKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKKLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 
LDKVTMQQVARTVAKVELSDHVCDWFALFDC 
DGNGELSNKEFVSIMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 


3177 


A 




OHO 


CUV VOooAA VOOK<,^AAXOAALOKKrMAAVLG 

ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 
FGFMSRVALQAEJCMNHHPEWFNVYNKVQITLTS 
HDCGELtKKDVKLAKFIEKAAASV 


3178 


A 


8 


612 


ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

l^isJu'rll^J^rl 1 r KAr 1 rKLKJLljAHKvjvjoVjJlLLbN 1 M 

EAMENSMAQRSDLLELDCQLTRDRWWSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIREIAGLVRRYDRNErnWASEKSSVMKKCK 


3179 


A 


88 


1496 


QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN
LTKNDLYPNPKPE\a,HMIYMRALQIVYGIRLEHF 
YMMPVNSEVMYPHLMEGFLPFSNLVTPILDSFLPI 
CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 
RETYMEFLWQYKSSADKMQQLNAAHQEALMK 
LERLDS VPVEEQEEFKQLSDGIQELQQSLNQDFPf ' 
QKTrVLQEGNSQKKSNISEKTKRLNELKLSWSL 
mQESLKTKIVDSPEKLKhPj'KEKMKDTVQKLK 
NARQEWEKYEIYGbSVDCLPSCQLEVQLYQKK 
Iv^ULblJNKJiKJjAolLKtbLNLEDQIESDESELKKL 
KTEENSFKRLMTVKKEKLATAQFKINKKHEDVK 
QYKRTVIEDCNKVQEKRGAVYERVTTINHEIQKI 
RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD
GIEKAAEDSYAKIDEKTAELKRKMFKMST


3180 


A 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGIICNANNPCFRYPTPGEAPGWGNFNK 

SIVARLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPELRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRTVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYC^roLMKNLESSPL 

SRITWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDW 

EQAHRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMRJMGLDNSILWFSWnSSLIPLLVSAGLLWI 

LKLGNLLPYSDPSWFVFLSVFAVVTILQCFLIST 

ijsranlaaacggnyftlylpyvlrvawodyv 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRJSEICMEEEPTHLKLGVSIQNLVKVY 
rdgmkvavdglalnfyegqitsflghngagktt 

tmsiltglfpptsgtayilgkdirsemstirqnlg 

vcpqhnvlfdmltveehiwfyarlkglsekhvk 

aemeqmaldvglpssklksktsqlsggmqrkls 

valafvggskWildeptagvdpysrrgiwelll 

kyrqgrthlsthhmdeadvlgdrianshgklcc 

vgsslflknqlgtgyyltlvkkdvesslsscrns 

sstvsylkkedsvsqsssdaglgsdhesdtltid 

vsaisnlirkhvsearlvedigheltyvlpyeaa 

kegafvelfheiddrlsdlgissygisettleeifl 

kvaeesgvdaetsdgtlparrnrrafgdkqscl 

rpfteddaadpndsdidpesretdllsgmdgkgs 

yqvkgwkltqqqfvallwkrlliarrsrkgff 

aqivlpavfvcialvfslrvppfgkypslelqpwm 

yneqytfvsndapedtgtlellnaltkdpgfgt 

rcmegnpipdtpcqageeewttapvpqtimdlfq 

ngnwtmqnpspacqcssdkkkmlpvcppgagg 

lpppqrkqntadilqdltgrnisdylvktyvqna 

kslknkiwvnefryggfslgvsntqalppsqev 

ndatkqmkkhlklakdssadrflnslgrfmtg 

ldtr>jnvkvwfnnkgwhaissflnvinnailra 

nlqkgenpshygitafnhplnltkqqlsevapm 

ttsvdvlvsicvifamsfvpasfwfliqervska 

khlqfisgvkpviywlsnfvwdmcnyvvpatlv 

inflcfqqksyvsstolpvlalllllygwsitplm 

ypasfvfbapstaywltsvnlfigingsvatfvl : 

elftdnklnnindilksvflifphfclgrglidmv 

knqamadalerfgenrfvsplswdlvgrnlfa 

mavegvvfflitvliqyrffirprpvnaklsplnd 

ededvrrerqrildgggqndileikeltkjyrrk 

rkpavdricvgeppgecfgllgvngagksstfkm 

ltgdttvtrgdaflnknsilsnihevhqnmgycp 

qfdaitelltgrehveffallrgvpekevgkvge 

wairklglvkygekyagnysggltkrklstama 

liggppvvfldepttgmdpkarrflwngalsvv 

kegrswltshsmeecealctrmaimvngrfrc 

LGSVQHLKNRFGDGYTIWRIAGSNPDLKPVQDF 
FGLAFPGSVPKEKHKNMLQYQLPSSLSSLARIFSI 
LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 
DHLBCDLSLHKNQTVYDVAVLTSFLQDEKVKESY 

V


3181 


A


215 


1367 


PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWLFTEFPELAPSQNQNHLKDWFLENKSEVPEC 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

FPILQTNNDPGLFVYCCDFSSTAIELVQTNSEYDP 

SRCFAFVHDLCDEEKSYPWKGSLDniLIFVLSAI 

VPDKMQKArNRLSRLLBCPGGMVLLRDYGRYDM 

AHT D "PVI^ nor*! Qr;XTVV\7TJri'rvr;'TP\rVT7T?TOT7'PT T\ 
i\\^\JisX t!^JSSj\^y^ I VKLjJL/u1K.V Yrr lv^iixiL.JL/ 

TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 
WIQCKYCKPLLSSTS 


3182 


A 


3


1289 


GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ 
AEEENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKEN AAAPSPVRAPAPSPA 

KEERKTEWMNSQQTPVGTPKDKRVSNTPLRtV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

LSSSE^^JELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMTFMGYQNVEDEAETKKVLGLQDTITAEL 

ATVIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 


3183 


A 


333 


1931 


lAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DWVPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR

GQDAAARLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 


3184 


A


1 


1004 


gsthasadawaqwfctealvmgapvwylvaa 

allvgfilfltrsrgraAsagqeplhneelagag 

rVaqpgplepeepraggrprrrrdlgsrlqaqr 

raqrvawaeadeneeeavilaqeeegvekpaet 

hlsgkigakklrkleekqarkaqreaeeaeree 

RIGILESQREAEwKKEEERLRLEEEQKEEEERKA 

reeqaqreheeylklkeafweeegvge™tee 
qsqsfltefinyikqskvvlledlasqvglrtqd 

TINRIQDLLAEGTITGVIDDRGKFIYITPEELAAVA 
NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA


3185 


A 


2981 


7173 


cllagkfsstlyetggcdmslvnfepaarrasni 
cdtdshvssstsvrfyphdvlslpqirlnrllted 
tdlleqqdddlspdlaatygpteeaaqkvkhyy 
rfwilpqlwiginfdrltllalfdrnreilenvla 
vilailvaflgsilliqgffrdiwvfqfclviascq 
ysllksvqpdsssprhghnriiaysrpvyfciccg 
liwlldygsrnltatkfklygitftnplvfisard 
lvivftlcfprvfngllpqvntfvmylceqldihi 
fggnattsllaalysficsivavallyglcygal 
kdswdgqfflpvlfsifcgllvavsyhlsrqssdp 
svlfslvqskifpkteeknpedplsevkdplpekl 
rnsvserlqsdlvvcivigvlyfaihvstvftvlq 

PAT w\/T VTT \/riT?\^/^T?\^'njv\rr 'Df\\TT>vf\i xiwrcs 
rAJLlS.1 Vlyl ILVLrr VOr V IH 1 VLJrt^VKJi.(^ljrWri 

CFSHPLLKTLEYNQYEVRNAATMMWFEBCLHVW 
LLFVEKIWPLIVLNELSSSAETIASPKKLNTELG 
ALMlTVAGLBaXRSSFSSPTYQYVTVIFrVLFFKF 
DYEAFSETMLLDLFFMSDLFNKLWELLYKLQFVY 
TYIAPWQITWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VrVTKmEGYSITDNSAASMLQVFDLRKVLTTY 

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSKRAKPVDVDKDSSLVTLCYGLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVWPGIRMSrKLHQDHFT 

SPDEYDDPTVLYEATVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALKNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSSTSQISLKNLPSSIQSRLSMVNQ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGNTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSHJDKAVLLVQIDDKYVTVIETGVLELGAEV 


3186 


A


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLWEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3187


A 

A 


i 


470


SLSAMRFLAATTLLLALSTAAQAEPVQFKDCGSV 

DGVKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAWHGILMGVPVPFPIPEPDGCKSGINC 

PIQBCDKTYSYLNKLPVKSEYPSIKLWEWQLQDD 

KNQSLFCWEIPVQIVSHL 


3188 


A 


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKNRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGn<CFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEICERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQIAQLETALKSDLTDKTEILDRL 

KTCRDQNEKLVQENRELQLQYLEQKQQLDELKK 

RKLYNQENDINADELSEALLLnCAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLIMQHKINKDYQMEVEAVTKKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDL^YGTK 

QYKFKPEIMPDDSVDEFDETIHLERGENLFEIHIN 

JS.V IrbbEVJjV^AaOJJisJirV IrCl YAr i JJrr.JLyl ir 

WRGLHPEYNFTSQYLVHVNDLFLQYIQPCNTITL 

EVHQAYSTEYETIAACQLICFHEILEKSGRIFCTAS 

LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 

AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTD 
WO 01/57190
STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KITOFADHDTAIIPSShnDPQFDDHMYFPVPMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLAHDRCISGIFELTDHQKHPAGTIHVILKWKFA 

YLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKNIKQPSEBORIEnALSLNDSQVTMDDTIQRLFV 

CV^Kr I ol^rAiilllr VoJ-»risJ:^jVoO^^ W V i ilN i oIn VI Y 

VDKENNKAKRDILKAILQKQEMPNRSLRFTWS 
DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 
QNIDVEDARADGEGIGKLRVTVEALHALQSVYK 
QYRDDLEA 


3189


A 


476 


1175 


MKGSGWHLRSGMVGTLITTILPHWRRTAHVGTN 
ILTAVSYLKGLWMECVWHSTGIYQCQIYRSLLA 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 
NDVVQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 
IGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAP 
AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 


3190 


A 


267 


1037 


DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAKLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 
DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 
GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 
RQFKVVTRSQEDPNAYSGELFDA 


3191 


A 


29 


574 


GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

r I lOJJAuAobl y rMyUoAljKjtlNOr V VLii.vjKl'UK. 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 
PSTHNMDVPNKRNDYQLICIQDGYLSLLTETGE 
VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 
MSEEYAVAIKPCK 


3192 


A 



105 


1661 


KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 
WIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 
GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 
ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 
GRRIPKDWEEFSDLYNEVYNLTQEFFRHDKPVN 
AESQNSVGVrniEEVRNRIKNDPDDPEATKRLKL 
AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP
GAHHnPSGFMRWELLAEGIPAHVIQLGKPVRCI 
HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 
EEPRGGRWDEDEQWSVWECEDCELPADHVIV 
TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 
UrJ-Tl^Criltrr WurtL-WoLyr V WtUt Atari 1 L J Y 
PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 
LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 
LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 
LPYTESSKTATK 


3193 


A 


1 


192S 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
ANLSWFKDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVFFQNCHLAPSWMPALERLIEHINPDKVHRDF 
RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFRSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFroGDLIUCISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANrrFAQNETFA 

LLGTnQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR 

LLQVITQTLQDLLKALKGLWMSSQLELMAASL 

YNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG 

LFLEGARWDPEAFQLAESQPKELYTEMAVIWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 


3194 


A 


1


1023 


DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPWPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLNALKIPLRIS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KAVSQALTNHFQMSSDGWWSLEDVTCNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 


3195 


A 


1 


1809 


MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGBaLLEKMSSERDGLGSDDGVCTKI

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRJR. 

SHLTRHQRIHSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGICAFRDRPGFIRHYUHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCES ADLIQHYIIHTGEKPYKCMECG 

KAFNRRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYEGKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLIRHSnHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRlsIPTIVTDVGRP 

FMTAQTSVNIQELLLGKEFLNITTEENLW 


3196 


A 


1400 


264 


VGFWERPLRSSRWFRRSLRRWEMLARAARGTG 

ALLLRGSLLASGRAPRRASSGLPRNTWLFVPQQ 

EAWWERMGRFHRJDLEPGLNILffVLDRIRYVQSL 

KErVINVPEQSAVTLDNVTLQE)GVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESLNASrVDAINQAADCWGIRCLRYEIKDIH 

yPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEBCAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNmLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 



PCT/US01/04098



ELDRVKMS 


3197 


A 


66 


3632 


LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREYCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGWYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV 

NLEGENGNTAVIIACTTNNSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEnLRFGEEHGY 

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL 

DNGAQBDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTDGCHETMLHRASLFDHHELAD 

YLISVGADINKroSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRIN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKWQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEWLTURSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 

YCLGLIPMTILWNIKPGMAFNSTGIINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWnYTTGnFVLPLFVEEPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGMVMLE 

VDLKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSnQTFSMMLGDINYRESFLEPYLRNELAHPV 

LSFAQLVSFTIFVPIVLMNLLIGLAVGDIAEYQKH 

ASLKRJAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLnQKMEn 

SETEDDDSHCSFQDRFJOCEQMEQRNSRWNTVLR 

AVKAKTHHLEP 


3198


A 


51 


2177 


KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

VPATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNE^JQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPnCCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEHR 

EKCSECSYSCSSKAALRJHERIHCTVRPFKCNYCS 

FDSKQPSNLSKHMKKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKirVGHQVPQANT 

rVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD 
GATLHQTLPTASGGPQEGSGNQTFITSSGITCTD 

FEGLNALIQEGTAEVTWSDGGQNIAVATTAPPV 

FSSSSQQELPKQTYSnQGAAHPALLCPADSIPD 


3199 


A 


13 


2247 


QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGViFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKKNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALVAEAGFCYVQVAEGQRWGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPIAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AVVLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGVYGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQIDEIKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNniTEPYFNFTSIQESMNEILFEEYQFQ 

A VLRVNAGALSAHRYFRDNPSELCCnVDSGYSF 

THIVPYCRSKKKKEAIlRINVGGKLLTNHLKEnSY 

RQLHVlvroETHVINQVKEDVCYVSQDFYRDMDI 

AKLKGEENTVMIDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEAIVYSIQNLPEEMQPHFFKNIVLTGGNSLF 

PGFRDRVYSEVRCLTPTDYDVSVVLPENPITYAW 

EGGKLISENDDFEDMWTREEiYEENGHSVCEEK 

FDI 


3200 


A 


3


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR

SCAGISGKSQLLFALVFTTRYLDLFTSFISLYNTS 

MKVWYAIHRNVFHLQCTGLWTLNLCQLCIFN 


3201 


A 


1 


469 


IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGRHRWVVYTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPFITKPLTARKFIWTNHKFNVT 

GTPEQYVPYSTTRKKIQEWIPPSTPYK 


3202 


A 


144 


840 


NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 

PQWRVSAFffiNNiVVFENFWEGLWMNCVRQANI 

RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 

AFMMAILGMKCTRCTGDNEKVKAHILLTAGnFn 

TGMWLIPVS WVANAIIRDFYNSI VNV AQKRELG 

EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR 

YSIPSHRTTQKSYHTGKKSPSVYSRSQYV 


3203 


A 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICrLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 


3204 


A 


1808 


668 


PESAPLPAFISSRILPAAWRNWCSYVVTRTISCHV 

QNGTYLQRVLQNCPWPMSCPGSSYRTVVRPTYK 

VMYKTVTAREWRCCPGHSRVSCEEVAGSSASLE 

PMWSGSTMRRMALRPTAFSGCLNCSKVSELTER 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 
DPGEKSHWGEGLHQLREALKILAERVLILETMIG 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRJ 
VAPRSRDERG 


3205 


A 


2810 


1652 


RTSTQKWQSVFNDSQEHLERFYCNPENDRMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVIRLffiE 

ANSRGLKEVRFMMWNNHYILHNSFFRREDCRRP 

LFRSCFILLPYLQU-GGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRHYNLFHKTVPEFKYRILQILRVQNQFLWEKY 

KRKKEYMNRKMFGRDRIINERHLFHGTSQDVVD 

GICKHNFDPRVCGKHATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSr 


3206 


A 


297 


4500 


CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSmKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQfflEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLEEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLffiHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSErraKDWQQNNTLIEEMLYL 

HMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVIEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEHVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFNNRLNFSDQPNLTQWIRnSQQIKALQFLRKE 

ESTPNNASTBCNSENVDELQLPEGFRPDFRPKIPYS 

ESnCEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSWQGHFCKPFASLVPND

SHEELPCDLDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHrFHLVTMAHnQILLTSCmENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEEPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 
RRGNPLHLCBCERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVGIDWQHL 


3207 


A 


49 


963 


QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGP\nnGELPQDFLRITPTQQQRQVQLD 

AQAAQQLQYGGAVGTVGRLNITVVQAKLAICNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

>aCVIHCTVPPGVDSFyLEIFDERAFSMDDRIAWT 

ml IrJc.oLK^^tjIS.Vt.L'Is.W i bLoOKViOlJUlNJbuIVLINL 

VMSYALLPAAMVMPPQPWLMPTVYQQGVGY 
VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 
DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 
SLLQMGEEP 


3208 


A 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVAL>IMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVW 

GLADLLSKHDSQHKLSEVITGDLLIIMAQnVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVILSLLL 

VrM r Yir AOor aONrKu 1 LcIJAljUArCy VuyQ 

LLAVALLGNISSIAFFNFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS


3209 


A 


104 


1999 


AKWSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV 

ENPASADSEAYEEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVMO-SGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLITVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQTMGAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

A C/^XT A T> A Ttr^ A ^Oi^ AT A Cr\CC7T7 A n\'C\7TS'DTirT T>C/^ 

VjAfcuNArAruAuOvAi^AbUobllAUJlVrtWLKiivj 
EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 
LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 
ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

T TA AT AT^ AT^TJQTJVTsTDTrKTDVCWA Q 


3210 


A 


324 


694


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALWS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSYTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 




A 

A 






WA>JFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVHISIFYPRVSLTDTGRFGVGMAESLLLKPL

TIDSAEAPADPFAVPEGRSQDARGY 


3212 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKTMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 
AFQNSSEREDCNNGEPPRKUPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMGHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKBCLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKnQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLWSGS 

SDNTIRLWDreCGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRt 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 


3213 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 
AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 
RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 
MTVPKQRKLSASYEKEKELCVKYFEQWSESDQV 
EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 
PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 
VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 
QYLFKNKPPDGNAPPNSFYRALYPKnQDIETIES 
NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 
SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 
DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 
VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL
RRVLVGHRAAVNWDFDDKYIVSASGDRTIKV 
WNTSTCEFVRTLNGHKRGIACLQYRDRLWSGS 
SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 
RTVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 
LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 
PAAQSEPPRSPSRTYTYISR


3214 


A 


1 


1962 


FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKDPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKHQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERN^ITGSSDSTVOIVWDVNTGEMLNTLIHHCEA 

VLHLKriNNoMMV 1 CbKDRblAVWDMASFTDI IL 

RRVLVGHRAAVhATVDFDDKYTVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RrVSGAYDGKnCVWDLVAALDPRAPAGTLCLRT 



LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 
PAAQSEPPRSPSRTYTYISR 


3215 


A 


2 


1376 


EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRWTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLDSrWTGEGVNDVRKGACASHVS 

TMASFLKGAfirVTINARAEEDVEPECIMEKVAKA 

SGAKfYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEKRVGKDSFWAKAEKEEENRRLEEKJRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWEQQQEWSRNRNEQESAVHPREIFKQKERA 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 

AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 

PEQETFYEQPPLVQQQGAGSEfflDHHIQGQGLSG 

QGLCARALYDYQAADDTEISFDPENLITGIEVIDE 

GWWRGYGPDGHFGMFPANYVELIE 


3216 


A 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADLSW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMrRVRWDNSALGNSPYHRAPRCIHVYKKN 

GVGKVGDQILLAIKGQKKKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPffTSLRKREG 

EYSKVLAIAQNFV


3217


A


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNVVQKLDHWLMSNSSELMITHALERVCSVMP

ASITKECULVDTYSPSLVQLVAKITPEKVCKFIRL 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 

CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 

YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 

GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 

NAVQHCQKHVWKEMHLHAGEHA 


3218 


A 


1 


1563 


MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 
CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 
VCQDIAAAAGNGLNPDATESDILALVMKTCEWL
PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA
QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF
MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV
RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF
VPADQALRLLPPQELCRKGGFCEELGAPARLTQ
WAMDGVPSLELGLPRKQSEMQMKAGVTCEVC
MNVVQKLDHWLMSNSSELMITHALERVCSVMP
ASITKECnLVDTYSPSLVQLVAKITPEKVCKFIRL
CGNRHRARAVHDAYAIVPSPEWDAENQGSFCNG
CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP
YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA 


3219 


A 


1623 


572 


TSAEGWKGCTCTFKDRSKLREHLRSHTQEKWA 













CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKJIFATERLLRDHMRNHVNHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVmW 

NQTNAQGQQEIVYYVLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 


3220 


A 


2760 


745 


SLGIPSGNTRGTGLVLDGDTSYTYHLVGMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL 

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENVVDREQEDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP ' 

HLANGHWPEKPQVKGWREENKVRAVPTWAS 

VQWDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASfPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGWRWEYFRLR - 

PLRFRAPDEPQQAQVPHVWGWEVAGAPALRLQ 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 


3221 


A 


15 


478 


SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYnSVKGIKGRLNRLPAA 
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGVFLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 


3222 


A 


207 


1321 


PLIPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEnQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTnGVGVGAGAYILARYAL>rHPDTVEGLVLINl 

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNIITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLVVGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 


3223 


A 


132 


1664 


SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YKLrFKXtjKLLESDYFRYYKVNLKRPCPFwNDIS 

QCGRRDCAVKPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVTYEENCFKPQTIKRPLNPLASGQG 










TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC 
RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 
EFHLTRQErVSLFNAFGRISYKCERIRKTSRNLLQ 
NIH


3224


A


2


803 


PGSnSWDRDAAGESGTRAASPSPSGSRTAGRLP 
SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 
LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TTr^TrMTT Cl^TT^/VT ■CTM>'F\rDT 11 ^*^■T A OOtTTJ 17T> CT 

JHjUJrI/OlvliVlYLc.UKl VKjLyLWUlAvjl^bKrKbL 

IPSYIRDSTVAVWYDITNLNSFQQTSKWIDDVRT 

ERGSDVIIMLVGNKTDLADKRQITffiEGEQRAKE 

LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMIDIKLDKPQEPPASEGGCSC 


3225 


A 


3 


5054 


PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 

VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 

SSNNGTSPNPIHIWDKVrVDGSDMEEWPCIASKD 

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 

TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 

QTSREQQSKMENAGVNFWSGREQAQIHNTDGP 

KNGNTNSLNLSSPNPMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 

QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 

S\\a3NNNIlSTGGSWNFGPQDSNDNKWGEGNKM 

TSGVSQGBWKQPTGSDELKIGEWSGPNQPNSST 

GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 

EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 

QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 

WDIEEVPRPEGKSDKGTEGWESAATQTKNSGG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS . 

WSSGPQPATPICDEEPSGWEEPSPQSISRKMDroD 

GTSA WGDPNS Y^fYKNVNL■WpKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG

GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL

SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP

PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR
QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL
QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR
FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK










TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHNPTHLSNKMWKNHISSRNTTPL 

PRPPPGLTNPkPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFm,NLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRPLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 


3226 


A 


200 


1387 


VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC " 

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLILYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN . 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFVVSCSLWNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 


3227 


A 


1 


679 


RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKAKFSGT 

WYAMAKXDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK

MKYWGVASFLQKGNDDHWrVDTDYDTYAVQY

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKTV 

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 


3228 


A 


430 


1104


QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVWGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSWNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

VFDEAILtlFHPKKKKKRCSEGHSCCSn 


3229 


A 


25 


722


AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRFSAVALCKAA 


3230 


A 


282 


1479 


GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

uJtbQKWbMVMDKKJtlFKLwRRPlTGTHLYQYKV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALYKLE 

\TERDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPHKSFDENGFDYLLTYSDNPQTVFPR 













YCVSVmVSSGMPDFLEKLHMATLKAKNMEIKV 
KDYISAKPLEMSSEAXATSQSSERKNEGSCGPAR 
lEYA 


3231


A 


2117 


590 


FVPEPPEAGASSPCAPGDPDMSFRKWRQSKFRH

VFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF

LAVIVEASGGGAFLVLPLSKTGRIDKAYPTVCGH

tgpvldidwcphndexnasgsedctvmvwqipe 

ngltspltepwvleghtkrvgiia whptar>jvl 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN

VSWNHNGSLFCSACKDKSVRNDPRRGTLVAERE

KAHEGARPMRADFLADGKVFTTGFSRMSERQLA

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV

VYVCGKGDSSIRYFENEEPPYIHFLNTFTSKEPQR

GMGSMPKRGLEVSKCEIARFYKLHERKCEPIVM

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR

DADPILISLREAYVPSKQRDLKISRRNVLSDSRPA
MAPGSSHLGAPASTTTAADATPSGSLARAGEAG

KLEEVMQELRALRALVKEQGDRICRLEEQLGRM 
ENGDA 


3232 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS
MREDATILPSPTSETVLTVAAFGVISFIVILVVWI
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3233 


A 


3 


718 


RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 
GTAGAMQLCWVILGFLLFRGHNSQPTMTQTSSS 
QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS
MREDATILPSPTSETVLTVAAFGVISFIVILVVWI
ILVGWSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 


3234 


A 


1169 


4292 


AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

BCHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNTVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRIMKNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKINEDWL 

CNKCGVQNFKRREKCFKCG VPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTIILRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVKDKQTQLNRGFAnQLSTIE 

AAQLLQILQALHPPLTEDGKTINVEFAKGSBCRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYQQDEGYGNSQGTESSLYA 

rlU I L.JS.U 1 l\.*jr Ul llj 1 IkXjUr I u AOrbAoLbr (jADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 










GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME 

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA 

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 


3235 


A 


3 


1217 


PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 

EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 

LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 

APALCLLLRLGADPAHQDRHGDTALHAAARQG 

PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQILG 

WGPPWDSAEEEEEDDASKEREWRQKLQGELED 

EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 

HAQKCQQQQREAEGSCRPPRAEGSSQSWRQQEE 

EQRLFRERARAKEEELRESRARRAQEALGDREP 

KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG 

GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 

RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA 

LNRHAEALK 


3236 


A 


3 


1416 


GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQILPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKTVQFHWRNMHAPGMKKnaD 

TPEEIARWREERRKJWTLANIERKKKLKLEKEK 

RGAVLTTTQYGKMKGMSRHSQMAKIRSPGKNH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVROTSEAKSENRKKSFEKTNPK 

REKRLSQLSNVIRTKNTPSISLGNASSSGHST 


3237 


A 


3806


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRWSRKKMSLKSERRGIHVDQSDLLCKKG 

CG YYGNPAWQGFCSKCWREEYHKARQKQIQED 

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KKTr^KTRKYITVKKFFSASSRVGSKKEIQEAICA 

PSPSDNRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLiFLEGMHYKRDLSIEEQSECAQDFYHNVAE 

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDLAIQKRIRALRWVTPQMLC VP V 

NEDIPEYSDMYVKAITDIIEMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPAS ADDFLPTLIYIVLKGNP 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKM^DliSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDrVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 


3238 


A 


1373 


449


VLSVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM 






AYHLAKMGAHVWTARSKETLQKWSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD 
MLmNfflTNTSLNLFHDDIHHVRKSMEVNFLSYV 

VT TVAAI PMT Ifn^TvinQTWA/QCT Anv\/ A'\rD\A\r A 
V Lii V r\^j^riviLi]\\^oiy Kjolvvv ooJ^/VVJlS. v A i rlVl V A 

AYSASKFALDGFFSSmKEYSVSRVNVSITLCVLG 
LIDTETAMKAVSGIVHMQAAPKEECALEIIKGGA 

LROEEVYYD'sST WTTT T fRUPPPTc'TT PPT V<!T«:V>J 

MDRFINK 


3239 


A 


213 


422 


ERTMOLETKVAT NPITFYT VNKT T W/OPT Vk'lf*P A 

HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 


3240


A 


1255 


1425 


HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA 

VAHSPMP^TI VOWnrjPTTRnOPT n 


3241 


A 


161 


547 


PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS 

QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP 
DLITCLEQGICEPWNMKRHAMVDQPPGR 


3242 


A 


50 


241 


PLPARGKSTLPATFCSPSAPELASMSVVPPNRSQT 
GWPR G VTnPrjKFif vrnnTirPT tt PRTnar 

>J w r jvvj V 1 v^r VJlNIv 1 1 rwl^l^ J Jl^t.!^ 1 IXNJL 


3243 


A 


380 


702 


FVAYLKLPFFSQVCLFASSEMFFTISRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI 
LDLSKRYVKALAEENKNTVDVENGASMAGYGK 
ITVEYF


3244 


A 


37 


1391 


VLMDGRMMRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASKNSTVSRLIFTFFLFLGVLVSnMLSP 

GVESQLYKLPWVCEEGAGIPTVLQGfflDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLILVGLTVGAFYIPDGSFTNIWFY 

FGVVGSFLFILIQLVLLBDFAHSWNQRWLGKAEE 

CDSRAWYAGLFFFTLLFYLLSIAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQkCNP 

HT PTr^T GMPTAA^ A nPT?/^VT7X/^Tirw/r\ A 'Dorv/r^T ttc 
rLUr 1 v<!lj'-JINtiI V VAOrJlO itiyw WiJArMVOLlir 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ' 

QQQQVAAGEGRAFDNEQDGVTYSYSFFHFCLVL 

ASLHVMMTLTNTJ^KPGETRKMISTWTAVWVKI 

PA<?WArrT T T VT 


3245


A 


52 


426 


SSLGNEDDEDLSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 

dldaiwgiweaaagagalitlllmlillvrlpf 
fkekeiocspvglhflfllgtlgp 


3246 


A 


3


515


HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 

RFT TvIVT T?^WT VMV<3TTAlv/m>JTT nQVPTMlXPT \rcv 

LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI
HNKTLYHITLWTFLLALGHFLSELFVYGTAAPTI
GVLAPLMVASFSILGMLVGLRYLEVEPVSRQKK
RN


3247 


A 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSV

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA

MKSEEQKKDARKGPLVPFPNQKSEAAEPPKTPP

SSCDSTNAAIAKQALKKPIKGKQAPRKKAQGKT

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI
ESGKEEGMKIDLEDGKGRGVIATKQFSRGDFWE
yhgdlbeitdakkrealyaqdpstgcymyyfqy 
LSKTYCVDATRETNRLGRLINHSKCGNCQTKLH










DIDGVPHLILIASRDIAAGEELLYDYGDRSKASIE
AHPWLKH 


3248 


A 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATTGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

IjJfKrr INrJbbKK.UKjiLDbNrrAbLVFY WBPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAWSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTNRLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 


3249


A


43


1210 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRWEPKRAVSREDSVKPGA 

HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIE 

VMEDRQSGKKRGFAFVTFDDHDTVDKTWQKY 

HTINGHNCEVKKALSKQEMQSAGSQRGRGGGS 

CjNrMuKuOlNl'OOGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 


3250 


A


32


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

rrKJLV lAJNLlllC^NKiiL,tKV10DCKrVSAlrRLr.K 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS

GGCSALELKDniDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3251 


A 


32 


1175 


VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAQKIGFC 

rrKl. V 1 ANJLl 1 HjNKbLbKVlCjDCKJr V!>A IrKLrK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDUTDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 


3252 


A 


1 


574 


PLGSNTAPALRVMVQAWYMDDAPGDPRQPHRP
UrOKr VOLcvjLKKLOVLY WKLDADKYENDPELE 
KIRRERNYSWMDnTICKDKLPNYEEKIKMFYEE 

GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 
WTAYNRPADHFEARGQYVKFLAQTA 


3253 


A 


2


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVnLLGLAVGSYLVRRSRRPQVTLLX)PNE 










KYLLRLLDKTTVSHNTKRFRFALPTAHHTLGLPV
GKHIYLSTRIDGSLVIRPYTPVTSDEDQGYVDLVI 
KVYLKGVHPKFPEGGKMSQYLDSLKVGDWEF 
RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG 
MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 
TEKDIILREDLEELQARYPNRFKLWFTLDHPPKD 
WAYSKGFVtADMIREHLPAPGDDVLVLLCGPPP 
MVQLACHPNLDKLGYSQKMRFTY 


3254 


A


1 


968 


LQSAGEGVTHVLELLESPARPVAAVTQVQRRRY 

HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 

SKJGQRMLEKMGWSKGKGLGAQEQGATDHIKV 

QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

NTCHGQETTDSSDKKEKKSFSLEEKSKISKNRVH 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 

DASPSTPEE>reTrrTSAFTIQEYFAKRMAALKNK 

PQVPVPGSDISETQVERKRGKKRNKEATGKDVE 

SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 

EQLRGPCWDQSSKASAQDAGDHVQPA 


3255 


A 


173


439 


GSAAMKVKIKCWNGVATWLWVANDBNCGICR 

MAFNGCCPDCKVPGDDCPLVWGQCSHCFHMHC 

ILKWLHAQQVQQHCPMCRQEWKFKE 


3256 


A


2


377


TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WmSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GMVFLKHGSELRIIPRDRVGSC 


3257 


A 


3 


1454 


GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI 

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EBCFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAPLYNDQLIWSGLEQ 

DDMRJLYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCKFPKIFVNTD 

DTYEBLHLIVYKAMSAAVCFMIDASVHPTLDFC 

RRLDSrVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFIYFimiNLAEKSTVHMRKTPSVSLTSVHPD ■ 

LMKJLGDINSDFTRVDEDEEUVKAMSDYWVVG 

KKSDRRELYVILNQKNANLIEVNEEVKKLCATQF 

NNIFFLD 


3258


A


113


1558


APRGCSMPHRKKKPFBEKKKAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELPSSTFSAHNRREEK 

EETLVIPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDrVAALDDDFDFDDPDNLLEDDFEL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL 

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

NDYYKEKAENCVKLNTLEPLEDQDLPMNELDES 

EEEEMITWLEEAKEKWDCESICSTYSNLYNHPQ 

LDCYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE

RIQMINGSDLPKVSTQPRSIQ^IESKEDKRARKQAI 

KEERKERRVEKKANKLAFKLEKRRQEKELLNLK 

KNVEGLKL


3259 


A 


3 


964 


QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMMQtQNKVITYIACLMQMYF 

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKnSSIHAISSAQGKYKAFSTC 

ASHLSWSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTVVTPMLNPFIYSLKNKDIKRALGIHLLWGT 

MKGQFFKKCP 


3260 



A 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGILSPSELRKIFSNLE 

DILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PNVEELRNLDLTKRKMIHEGPLVWKVNRDKTID 

LYTLLLEDILVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGIS VTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ . 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASN1LVMDHN4IMTPEMPTMEPEGGLDD 

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVhfL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 


3261 


A 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST 

ATLIDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIWSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEDTQLIVESFHFKNGEDAPDLLK 

ATTKPFTKLIVQLDKKVISQIAMNDEKAKNICSLV 

KIWCKTFTNKTQINVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENIAKCQHIFVNFHLPDLAVGT 

ULILSLLVLCGCLIMIVKILGSVLKGQVATVIKKT 

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT 

SALTPLIGIGVXTffiRAYPLTLGSNIGTTTTAILAAL 

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

LSLAGWRVLVGVGVPWFmLVLCLRLLQSRCPR 
VLPKKLQNWNFLPLWMRSLKPWDAWSKFTGC 
FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 
DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 










SDSKTECTAL 


3262 


A 


30 


1377 


SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTKRWLF 

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIADLGIVLSLPVWMLEVTLDYTWLWG 

SFSCIUTHYFYFVNMYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVRJRAMCAGIWVLSAHPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAYVAVFVMCWLPYHVTLLLLTLHGTHISLHC 

HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQ'EKAGTCASSSSCSTQHSl 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


3263


A 


1 


919 


QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL 

GVGAATHWrVDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF 

RAQLDPAFIKTYEDIKENLESRRFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK 

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDVPIYDGSWVEWYMRARPEDVISE 

GRGKTH 


3264 


A 


1 


1398 


ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP 










FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDWRKESBSCDCLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQY 

RALTVPELTQQMFDSICNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLmSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 


3265


A


265 


862 


WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 

RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 

VKKSALCGEQVHILGCEVSEEEFREGFDSDINNR 

LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 

RTDPVPVTIALDSLSWLLLRLPCTTLCQVLHAVS 

HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 


3266 


A 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

ILWDLKFLMRNLALGGGLLLLLAESRSEGKSMF 

AOVrlMKJibbPKt^YMQLGGRVLLVLMFMTLLH . 

FDASFFSIVQNIVGTALMILVAIGFKTKLAALTLV 

VWLFAINVYFNAFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLWALGPGGVSMDEKKKEW 


3267 


A 


802 


1011 


ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP 










LCFVFGTAALSIRSMDVLSLFLEHGKXVFASGLSP 
RA 


3268 


A 


490 


679 


EDAWITNPSLSNARSTPSKPLCYTVLKEGQWGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 


3269 


A 


2 


796 


GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

MTTNAGPLHPYWPQHLRLDNFVnPNDRPTWHILA 

GLFSVTGVLWTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWVVIAFLRQHPLRFILQLWSVGQIYGDVLYF 

i, rEHRJjuFQHOELGHrLYFWr X FXTrMNAiyWL V 

LPGVLVLDAVKHLTHAQSrLDAKATKAKSKJCN 




3270


A


17


229


GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSOT 
KYLTKYSRKQVSDEIKKSRRTVESNPIFFiaCNKKI 
Q 


3271


A


419


553


IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 


3272


A


1211 


1450 


FQFIQIELLNILQSLIRNQTQSPYNfTrAYPAIDSVIT 
ILPFSFSCFFnTKCFGLSIFPSVIFFLHVYFILTLWF 
YCC 


3273 


A 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTEFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKKLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGEEMKNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAjfGDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEYINLMKNI<L 

DPEGLGHLLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAWMGFEDPMLQTD 

DTPIKRCLQTKWPYIELLWTTDRSPSLN 


3274 


A 


186 


1358 


RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRWRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK 

rWLLAu v.VUV laLbLLbDRKGLTRKliRRELRRR 

TILLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 




A 


J /J 


Ten 
/59 


5s V Y b AS SCKCCN Y RK. 1 bQlPDCEQPP ASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 


3276 


A 


7 


258 


KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 
CAFTVRHYMCSSHTG 


3277 


A 


9 


2221 


KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MSNPELKDKPLGVQQKYLWTCNYEARKLGVK 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPWERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGVFKPNQQTVLLPESCQHLfflSLNHIKEPGIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDlLMKLFKNnvlVNXnKMPFHLTLLSVCFCNLKAL 

NTAKKGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKETNRDFLPSGRIESTRTRESPLDTTOF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFQGKGSVSCPLHASRGVLSFFSKKQM 

(^DlPlNrKlJHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRlfflTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 


3278 


A 


1 


876 


GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 

KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL

YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

IGCGYGGLLVELSPLFPDTLDLGLEIRVKVSDYVQ 

DKiRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFKRTKHKWRESPTLLAEY

AYVLRVGGLVYTITDVLELHDWMCTHFEEHPLF 

ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 

NFPAIFRRIQDPVLQAVTSQTSLPGH 


3279 


A 


82


2929


TRTKRRLGREKAMASPPRGWGCGELLLPFMLLG 

TLCEPGSGQIRYSMPEELDKGSFVGNIAKDLGLE 

PQELAERGVRTVSRGRTQLFALNPRSGSLVTAGM 

DREELCAQSPLCVVNFNILVENKMaYGVEVEn 

DINDNFPRFRDEELKVKVNENAAAGTRLVLPEA 

RDADVGVNSLRSYQLSSNLHFSLDWSGTDGQK 

YPELVLEQPLDREKETVHDLLLTALDGGDPVLSG 

TTHIRVTVLDANDNAPLFTPSEYSVSVPENIPVGT 

RLLMLTATDPDEGINGKLTYSFRNEEEKISETFQL 

DSNLGEISTLQSLDYEESRFYLMEWAQDGGAL 

VASAKVWTVQDVNDNAPEVILTSLTSSISEDCL 

PGTYIALFSVHDGDSGENGEIACSIPRNLPFKLEK 

SVDhrraiLLTTRDLDREETSDYNITLTVMDHGT 

PPLSTESHIPLKVADVNDNPPhnFPQASYSTSVTEN 

NPRGVSIFSVTAHDPDSGDNARVTYSLAEDTFQG 

APLSSYVSINSDTGVLYALRSFDYEQLRDLQLWV 

TASDSGNPPLSSNVSLSLFVLDQNDNTPEILYPAL 

PTDGSTGVELAPRSAEPGYLVTKVVAVDKDSGQ 

NAWLSYRLLKASEPGLFAVGLHTGEVRTARALL 

DRDALKQSLVVAVEDHGQPPLSATFTVTVAVAD 

RIPDILADLGSIKTPIDPEDLDLTLYLWAVAAVS 

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAG 

VPASHFVGVnnxn^AFT DTYSHTRV*;! TAn^RTiT'?'!-! 

LIFPQPm'^ADTLLSEESCEKSEPLLMSDKVDANK 
EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 
GTWPNNQFDTEMLQAMILASASEAADGSSTLGG 
GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










VYIPGSNATLTNAAGKRDGKAPAGGNGNKKKS 
GBaOEKK 


3280 


A 


149 


1288 


GTSQMSSHKGSWAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVl 

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RIHTETGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKWFGMFFLGAVLCLSFSWLFHTVYCH 

SEKVSRTFSKLDYSGIALLIMGSFVPWLYYSFYCS 

PQPRLIYLSIVCVLGISAIIVAQWDRFATPKHRQT 

RAGVFLGLGLSGWPTMHFTIAEGFVKATTVGQ 

MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI 

WFQSHQFHVLVVAAAFVHFYGVSNLQEFRYGL 

EGGCTDDTLL 


3281 


A 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEiaA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKLAVhWIAGffiEVNMIKDDGTVIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 


3282 


A 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY 

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE 

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AWPKIFYVTEKA WNYYPYTITEYTCSFLPKFSIH 

lETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD 

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH 

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKWR 

DILLIGHRQAFAWVDEWYDMTMDDVREYEKN 

MHbQ 1 NIKVCNQHSSPVDDIESHAQTST 


3283 


A 


159 


547 


nCSKLNQQVEVQESEWRLTEAKGPTMGKESGW 
DSGRAAVAAWGGWAVGTVLVALSAMGFTSV 
GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 
VGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


3284 


A 


227 


637 


TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEGMAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 


3285 


A 


123 


1535 


HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL 

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPS VLEVREKG YERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREAMKQATAEKQLKEAQGBQ 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF

KKGHTKNKSTSSAMSGSHQDLS VIQPIVKDCKEA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 

CLTFSKSELASAVLEAVENNTLSIEPVGLQPmFV 

rk-f\o/\ V c<\-'Vjvjx^isj\.^/VJ_i 1 ov^oivo^jSJriisjJvi-.vji-'oolN 

YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV 

DQMFWEVMQLRKEMSLAKLGYFKEEL 


3286 


A 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
SQSRHSAQIHTPVAirELELGKYGQESEFLCLEFP 
EVKVNQILKTLSEVEESISTLISQPN 


3287 


A 


50 


390 


LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP 
G VSDAYYCKECTIQEKDRDGCPKrVNLGSSKTDL 
FYERKKYGFKKR 


3288 


A 


3 


428 


RTTFFRFRPCESLCGDMKLLTHNLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPmiLLSE 

EETES 


3289 


A 


1 


1743 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTl 

HGSPREDTGTPRSREMMFQPSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS 

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

SDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQI 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGICAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTINR 


3290 


A 


2 


1350 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAWW WSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDWLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA 

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

KRFLSDIPSSQILQEEMTWMKEILSNLGSPWLCH 

NDLLCKNIIYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILnQVNQFALASHFF 

WGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 


3291 


A 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLnGNSSVGKTSFLF 

RYADDSFTSAFVSTVGIDFKVKTVFKNEKRIKLQI 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQIKTYSWDNAQVILVGNKCDME 

DERVISTERGQHLGEQLGFEFFETSAKDNlRvKQ 

TFERLVDnCDKMSESLETDPATTAAKQNTRLKET 

PPPPQPNCAC 


3292 


A 


2


4136 


DRPPWNSRVDDFVTNLHLSSKGfflSPAKDTSLQ 
QRTPAEMSPVLHFYVRPSGHEGAASGHTRRKLQ 






WO 01/57190



PCT/US01/04098



SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










GKLPELQGVETELCYNVNWTAEALPSAEETKKL 
MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 
LNFSTPTSTNTVSVCRATGLGPVDRVETTRRYRLS 
FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 
ESMPEPLNGPINILGEGRJLALEKANQELGLALDS 
WDLDFi'TKRFQELQKNPSTVEArDLAQSNSEHS 
RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP
NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 
QQGLRHWFTAETHNFPTGVCPFSGATTGTGGRI 
RDVQCTGRGAHWAGTAGYCFGNLHIPGYNLP 
WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF 
GEPVLAGFARSLGLQLPDGQRREWnCPIMFSGGI 
GSMEADHISKEAPEPGMEWKVGGPVYRIGVGG 
GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 
NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 
LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 
ALLLRSP>fRDFLTHVSARERCPACFVGTITGDRRI 
VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 
VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 
LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC 
VGPLQTPLADVAWALSHEELIGAATALGEQPV 
KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 
CSGNWMWAABCLPGEGAALADACEAMVAVMA
ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 
AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 
QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 
ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 
NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 
DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 
VSVNGAWLEEPVGELRALWEETSFQLDRLQAE 
PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP 
GGPSPRVAILREEGSNGDREMADAFHLAGFEVW 
DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 
SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 
CNGCQLLALLGWVGGDPNEDAAEMGPDSQPAR 
PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 
MEGAVLPVWSAHGEGYVAFSSPELQAQIEARGL 
APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 
DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 
SPWLQLFINARNWTLEGSC 


3293 


A


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRERV 

AMHYQMSVTLKYEn<KLIYVHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 


3294 


A 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 
WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 
WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 
LRSGKEAKILQHFGDGLCRMLDERLQRHRTSGG 

JJnAJr JJor ouilJN or Ar (s^vjKijAJl V t^lJo 

KAGGSGSYWPARHSGARVILLVLYREHLNPNGH 
HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 
HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 
LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 






SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKLHVGDFVWVAQETNPRDPANP 

GEL\T.DHIVERKRLDDLCSSIIDGRFREQKFRLKR 

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VlDGFFvXRTADIKESAAyLAliTKGLQRLYQGH 

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGADCNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQKNLGPALSRTLSQLYCSYGPLT 


3295 


A 


2 


1115 


EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN 

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL 

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREMKAYEEDYGSSLEEDIQADTSGYLERJ 

LVCLLQG SRDD V S SFVDPAL ALQD AQDLYAAGE 

KmGTDEMKnmCTRSATHLLRVFEEYEKIANK 

SIEDSIKSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH 

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 


3296


A 


1 


838 


GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS 

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKKDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAMLUVCFEFIS 

MlLFIRaiPKLK 


3297 


A 


46 


617 


HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TGIPGSPACRQPVvGLHSLHNYRMAmVSAMSW 

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 

TCEVLAAHRCCNKNRIEERSQTVKCSCLPGKVAG 

TTEmRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRIHPRT 


3298 


A 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD 

TiT XTVO OX/XT'^ AT^ A rT~\t VTTT f/^T f TOTNX iTT^T 7 1 T''T ■vrTT'^T' 

PLNKSSYKYEADWDLNWCVISDMEVIELNKCT 
SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 
KKLEAAEERRKYQEAELLKHLAEKREHEREVIQ 
KAIEENM^IKMAI<EKLAQKMES>nCENREAHL A 
AMLERLQEKDKHAEEVRKNKELKEEASR 


3299 


A 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSPEKPKLRFDSRAPL 

VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

ggylhwghfemmrlti^^lsmdpk^^v^fj^^ 

APFKPITRKSVGHKMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GmKVLSPYDLTHKGKYWGKFYMPKRV 




A 


o 

z - 


1 RAT 
1 o*f / 


CBLVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 
TQNGRLTDFLDCVnSHFHLDHCGALPYFSEMVG 
YDGPIYWrraPTQAICPILLEDYRKlAVDKKGEAN 
FFTSQMIKDCMKKWAVHLHQTVQVDDELEIKA 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










YYAGHVLGAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWJDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWT 

NQKIRKTFVQRNMFEFKHIKAFDRAFADNPGPM 

\^ATPGMLHAGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSIPVGISLGLLKREMAQGLLPEAKKPRLLHGTLI 

MKDSNFRLVSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS 


3301 


A 


2 


349 


CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ 
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQH.AP 
AGGTLLLLGWLALAL 


3302 


A 


59 


1184


LRRNCSALGGLFQTnSDMKGSYPVWEDFFNKAG 

Ba.QSQLRTTWAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKKARQEI 

KKKSSDTLKLQKKAiCKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALEEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVBLDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M




A 

A 


CI 1 

511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGWTIEEFffiSCQK 
DENIMRSMQLFDNVI 


3304 


A 


40 


432 


ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSEFNIEMVKEKTAEEIKQrWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA 




A 

A 


2 


483 


LDACSTGPYSRSTHASADAWADAWVyVVLKW 
GMTLELLYFPQIFNKSNDGFTTTRSYGTVSQIFGS 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSW1SIESWDFCKGKGCTLAIVDNSETLKLLHDL 
HDAEKNYIALPYRSSKYMSTCNGTF 


3306 


A


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 

ALADNKPVAPDRRJSGHVGIIFSMSYLESKGLLA 

TASEDRSVRIWKGGDLRVPGGRVQNIGHCFGHS 

ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 

AFRGHQGRGIRAIAAHERQAWVITGGDDSGmL 

WHLVGRGYRGLG/DLGSLLQVP**ARYTQGCDS 

FPVLLPAGGSSWSRGLRTVCYGQWGRSCQGCPH 
QHSNCCCGPDPVSWEGAQLELGPAWL 


3307 


A 


2 


927 


RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










VRGA1CQERRLLTSAEKAIKNKNPPSSKPNRSSS\F 

WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

DFTKVKPHQAGYKYLLVLVDTFSGWTEAFATK 

NETVNMWKFLLNEnPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC 

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 


3308 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRynLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3309 


A 


490 


1077 


NSPSLDFNDNEDIPTELSDSSDTHDEGEVQAFYE 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 


3310 



A


2 


1198 


SPLCHPGLSRER/S*SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKHHYSRSRSRSRERKRKSDNEGRKH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

KKSRSRSRERKRRIRSRSRSRSRHRHRTRSRSRTR 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAQEALARRLERAKKLQEQREKEMVEKQKQQ 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 

AQMAALQAKALAETGLWPSYYNPAAVNPMKF
AEQEKKRKMLWQGKICEGDK:SQSAGNMGKN


3311 


A


177 


4 


PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
PGPAHDFPPLSAVLSGHTKT 


3312 


A


3 


426 


LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 

P/RPACPV/PGLWPSLLSPAPPQSSGPLLGLSPCPG 

AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSPVS 

ASAPCRAVPLSPRRLTWPPHLQVGILIPTGRPWK 

NL 


3313


A 


162 


2


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
L\P\CTPAWVTQRDFFRKKK 


3314 


A 


162 


2 


QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
L\P\CTPAWVTQRDFFRKKK 


3315 


A 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAjH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAHIGRVGRPY 


3316 


A 


3 


2307 


NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 
KSNSMLQKPTVAYVRPMDGQESMEPKLSSEHYSS 
V^bHUMtsM 1 bLlsPSSKAJlLTKLKIPSQPLDASASG 
DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 
FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 
LKDDLKLSSSEDSDGEQDCDKIMPRSTPGSNSEP
SHPINSEGADNSRDDSSSHSGSESSSGSDSESESSS 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

1>«>HKVSPASSVDS>JPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRXAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KJDESETPVDI^SSMPSSRHKAATXGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEJCKNTV^EKHTREAQKQASE 

KVSm:GKRKHKNEDDNRASESKKPKTEDKNSA 

GHKPSSNRESSKQSAAKEKDLLPSPAGPVPSKDP 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 

STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 

TLDSSKPRRTKLVFDDRI4YSADHYLQEAKKLKH 

NADALSDRFEKAVYYLDAWSFIECGNALEKNA 

QESKSPFPMYSETVDLI 


3317 


A 


496 


2 


NLLQDEKLVHSYPYDWRTQETCGYIVPARQWFI 

N\TRDIKTAAKELLKKVKFIPGSALNGMVEMMD 

RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT

TEHIVKLVEQHGSDrWWTLPPEQLLPKEVLSEVG 

GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 


3318 


A 


2 


512


AWHEGDSRSDQCHHPYNYGFDYYYGMPFTLVD 
SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF
GKLSGWVSVPWLLffSMELFIFLLGYA WFSSHTSP 
LYWDCLLMRGHEITEQPMKAE\RAGSIMVKEAIF 
LFRKGHSKGKLFLLFFLPFLQVHKTFPTTDGFHW 
AP 


3319 


A 


407 


1 


SSLHRSPRPASPLPVPEAP\SFLPVPAPBa>SALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320 


A 


4037 


3432 


QMSEAVAEKMLQYRRDTAGWKICREGNGVSVS 
WRPSVEFPGNLYRGEGIVYGTLBEVWDCVKPAV 
GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 
MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 
CPPKPGFVRGFNHPCGCFCEPLPGEPTKTNLVTFF 
HTDLSGYLPQNWDSFFPRSMTRFYANLQKAVK 


3321 


A 


37 


360 


SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 
KGPGSEGTP 


3322 


A 


1 


420 


AIVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A 

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVrVTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVKPIDENLGQTGKSAVCIHQDINDDH 

VEDVT 




A 


8 


459 


DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 
TKTYSNEVVTLWYRPPDILLGSTDYSTQIDMW*G 
QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP 
RGVGCIFYEMATGRPLFPGSTVEEQLHFIFRILSE 

Jti A W /\JjL> A V 1 1 JriK 


3324 


A 


1276 


466 


PGSTHASARrnY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLIPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 


3325 


A 


266 


3312 


TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 
GSLRVTFASGMEIGLSSEPHILAGAVNPTLGKCNI 
SLPGEHNA2SILISVL**GEQGCA*NVFfflSFS*AHN 
RNLLSIDFDHITRTGKIYDDHRKFTLRILYDQTGR 
PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 
MEYDQSFL*SPQL*LSUCYSAFVSFQSVMLLLHS 
QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 
GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 
RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD
LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 
VNARFDYSYNNFRVTSMQAVINETPLPIDLYRYV 
DVSGRTEQFGKFSVmYDLNQVITTTVMKHTKIF 
SANGQVffiVQYEILKAIAYWMTIQYDNVGRMVI 
CDIRVGVDAMTRYFYEYDADGQLQTVSVNDKT 
QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 
TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 
KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 
QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 
LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 
EELYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL
VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP
FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL
HNVLPGFPKPELENSPSPQMSNSMLHLLCASLS* 
TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 
GGKQPRFAAVPSVFGKGIKFAIKDGIVTADnGVA 
NEDSRRIAAILNNAHYLENLHFTIEGRDTHYFIK
LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 
LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 
VLEIARQRAVAQAWTKEQRRLQBGEEGIRA WTE 
GEKQQLLSTGRVQGYDGYFVLSVEQ 


3326 


A 


290 


1041 



kaclhllssfltsnflfnpllpdslysvearsqra 
nlgpcrrkrlqtlmrlaagfqysshkdpslsak 
ekhtdyhneargpwpgwvg*r:tadgscgrgpd 
gahhpgpkssswrasrllpglggshhldayvgr 
dlecgtpaplqleippqprghpapiptgqagprds 
gpgasp*vetrpltdgrr*pgvrpvgwtpahpag 

TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 
AVPKHRAWRTPLCSQ 


3327 


A 


1 


418 


CSECGKSFCKKSKFIIHQRTHTGEBa>YECNQCGK 
SFCQKGILTVHQRTHTGEKPYECNECGKNFYQK 
LHLIQHQRTHSGEKPYECSYCGKSFCQKTHLTQH 
QRTHSGERPYVCHDCGKTFSQKSALNDHQBOHT 
GVKLY


3328 


A


1 


270


VTRKLPIFrVDAFTARAFRGSPAADCLLENELDED 
MHQKIAREMNLSETAFIRKLHPTDNFAQRSCFGL 
IWFTPTTDLQILTSSILPSIL 


3329 


A 


45 


419 


EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 


3330 


A 


64 


430


FWR2<IFTGLAPAAAVATTTSSSTN'!RFTSISNSLTST 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ

LLSRGFENLVPYTSTVSVVTTPVMTYGHLEGLIN 

EGNLELEIKRRLSSQATQ 


3331 


A 


3 


407 


TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 


3332 


A 


25 


461 


PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

KMTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

NNWPRINTLILRTNQQYLNLLSTSVTADAEDFS 

TFFFLDSQDKSA 


3333 


A 


317 


54 


AWnFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQmSSRRHEIVDPV 


3334 


A 


304 


410 


AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 


3335 


A 


19 


418 


VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRI 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSA\TAKNMYYLTQDDESnSAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYNPFF 


3336 


A 


1 


1003 


PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERLAGGATRDSAASDDLLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPED V ANH 

LTAFHWELFRCVHELEFVDYVFHGE 


3337 


A 


444 


43 


KILLCLANQFPDISFCPALPAWALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 


3338 


A 


1 


398 


FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK 


3339 


A 


1 


665


AAAASNWGLITNTVNSIVGVSVLTMPFCFKQCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKMLVETSMIGLMLGTCIAFYV 

VIGDLGSNFFARLFGFQVGGTFKMFLLFAVSLCI 

VLPLSLQRNMMASIQSFSAMALLFYTVFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV 


3340 


A


198 


367 


LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP 


3341 


A 


562 


277 


HSVIKRTPRKYLAEIVLIDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 
VLIYLDAHCEVAVNWYAPLVAPISKDR 




A 
A 




2 


NL 1 W WPLhKUV bl' YIVDLIMLIIFFLDNVIMWWE 
SLLLLTAYFCYWFMKFNVQVEKWVKQMINRN 
KWKVTAPEAQAKPSAARDKDEPTLPAKPRLQR 
GGSSASLHNSLMRNSBFQNKIHTLDPHV 


3343 


A 


1 


385 


FRVDNSEEWKDVFnSSERSFKLDSLKCGTWYKV 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










THINSTHARL>nLQGWNNGGCPITArVLEYRPKGT 
WAWOni RAN9^nFVFT TFT RFATWV 


3344 


A 


351 


147 


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3345 


A 


351


147


SPACITSSLSQHIADPRAAPTEVKVRVMNSTAlSL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 


3346 


A 


3 


1509 


AGIRHEAPPTTSNRHRRQIDRGVTULNISGLKMP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMyWSDWGNHPK 

lETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKISVIGSIRLNGTDPIVAADSKRGLSHP 

FSroVFEDYIYGVTYINNRVFKIHKFGHSPLVNLT 

GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRC? 

TGFTGPKCTQQVCAGYCANNSTCTVNQGNQPQ 

(^POT p/^PT r^T~\p i^r\vT>^r^c/^vi^'cxTc/^T'/^/^Ti>r A at^ 
(-KUL.r ur HjJJKUl^ Y Kl^Uoo I CJblNr u 1 CCjMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACWNK 

QSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 

SILIP 


3347 


A 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITGVSHRARPENGFENIF 


3348 


A 


1 


1171 


LSKITMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GETYELVRDDLGFRLISEQVSHHPPISAFHAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVTINir/GKLWIEQYGNVEIINH 

KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 

KLCALYGKWTECLYSVDPATFDAYKKNDKKNT 

iiiiJsuvJNaJS.^jMi> 1 btliLJJJbMr VrJJbbiVrlirObVLL 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTDCRLRPDIRAMENGEEDQASEEKKRLEEKQ 

fv-ft-fviviSJN i\.oivoE.jDU W Is. 1 iv W r rll^vjr IN 1^ Y N O Al^Jj 

WIYSGSYWDRNYENLPDIY 


3349 


A 


403 


497 


NFASSSGKYLRTQKKCLNNKFTPFPTTEKK'SQS 
VRPP*S^fRIY*ILQS*NISFS*LPN*NFASSSGKYLR 
TQKIKGLNNKFTPFPTTEKK 


3350 


A 


1 


712 


GAPAQDCICLPFPFHSSFLESDIRKPARRKIQTTNP 
DFLLLLFMSVPVVSAPPFCPPAEGSRDGRPKASV 

APPA A VHTHMTJQPPT^rrjm PT^\7TPCCT /^/^"Xiyr^DTJ*!! 
/vi\r/vft. V nilrlxlorjNJJCLrni^Jr V JjvooLvjvj Wtsirrl r 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 
GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 
SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 
LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 


3351 


A 


I 


428 


"KA A A VV A A TA T T/TJPri A PM A P^/T P /^TT A A 'T A XTl/' 

ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 
GKNPMKAVGLAWAIGFPCGE.LFILTBtREVDKDR 
VKQMKARQNMRLSNTGEYESQRFRASSQSAPSP 
DVGSGVQT 


3352 


A 


2 


841 


RTLFRGRRJRREDDRISRPHPSTAESKAPTPKFDLL
ASNFPPLPGSSSRMPGELVLENRMSDWKGVYK 
SEQ ID
NO:


Method


Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence


Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
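
The legend in the column header can be read as a small character-level encoding: twenty residue letters, X for an unknown residue, and three annotation marks (`*`, `/`, `\`). The following is a minimal sketch, not part of the original listing, of how one might tally those symbols in a single sequence line. The function name `summarize_peptide` and the shape of the returned dictionary are assumptions made for illustration; any character outside the legend (for example scanning debris) is only reported, never interpreted.

```python
# Legend taken from the table's column header.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine", "L": "Leucine",
    "M": "Methionine", "N": "Asparagine", "P": "Proline",
    "Q": "Glutamine", "R": "Arginine", "S": "Serine", "T": "Threonine",
    "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}
# Annotation marks: stop codon, possible nucleotide deletion / insertion.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_peptide(seq):
    """Hypothetical helper: count legend symbols in one sequence string."""
    summary = {"residues": 0, "stop": 0, "deletion": 0,
               "insertion": 0, "unrecognized": []}
    for ch in seq:
        if ch.isspace():
            continue  # line breaks and stray spaces carry no information
        if ch in RESIDUES:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary[MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)  # report, do not guess
    return summary


if __name__ == "__main__":
    print(summarize_peptide("AC*D/E\\F"))
```

Keeping unrecognized characters in a separate list, rather than dropping them, makes it easy to spot lines where the scanned text no longer matches the legend.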










EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTIP 

VSPPSTTXPSRASTASPCNNNINAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNWSPT 

KNEDNGAPENSVEia»HEKPEARASKDYSGFRGN

nPRGAAGBOREQRRQFSHRAIPQGVTRJElNGKEQ 

YVPPRSPK 


3353 


A 


1054


587 


lATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 

PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 

SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 

THLGTIPVPKGKPLALVEEIRNRKDVKVFNVTKE 

NRNHLLPDIVTCVQSSRK 


3354 


A 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEVVER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGPILEGHCLVRWAEELEm'RILP 

HTVLYMADSETFISLEECRGHKRARKRTSMETA 

LALEKLFPKQCQVLGIVTPGIWTPMGSGSNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRWLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGWGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK

GNVEADAFRKPFPSVPLFGFFGNGEIGCDRIVTG 

OTE.RKCNEVKDDDLFHSYTTIMALIHLGSSK 


3355


A 


1


707


GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 

SRLLRERTVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSRIMIH 

QPSGGARGQATDIAIQAEEIMKLKKQLYNIYAKH 

TKQSLQVIESAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 


3356 


A 


352 


338 


FNYNFCRNLHMPSFLV*PGMCGLLAKHLSFHIVG 
AFLIT/LGVAALCKFAVA*PRKKAYADFYRNYN* 
IKEFEVRKANISQSTK 


3357


A 


1


403 


ALGSCGGLLGTGLLKGTMSGTLWSKGIFAGYKR 
RIRIQREHTAVLKIEGWYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGISI 
MVRAQFRTha,PADAIGHRIRMML*PSRMYTTEPS 


3358 


A 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 
VMDSERQVKDTDDIESPKRSIRDSGYIDCWDSER
SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDWL 
RGSSDGRGSDSESDLPHRKLPDVKKDDMSARRT 
SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 
AEREEYRKSWSTATSPAGLGKKALQPYGPRT\PV 
S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 
LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 
KEEERKKMEKLLAGEDGTSERRKSKTYREIVQE 
KERRERELHEAYKNARSQEEAEGmQQYIERFTIS 
EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 
YLRQQSLPPPKFTATVETnARASVLDTSMSAGS 
GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 
SQMFEGVARVHGSPLELKQDNGSIEINIKKPNSV 
PQELAATTEKTEPNSQEDKNDGGKSRKGNIELAS 
SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 

KDQKKPENEMSGKVELVLSQKWKPKSPEPEAT

LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQKEVEEEERRYYEEEP* 

mEDPWPFTVSSSSADQLSTSSSMTEGSGTMNKI 

DLGNCQDEKQDRRWKKSFQGDDSDLLLKTRES 

DRLEEKGSLTEGALAHSGNPV^KnVTrPDHnT nx 

EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 
KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMIIETLNLYFHIQCFRCGUCKGQLGDA 
VSGTDVRIKNGLLNCNDCYMRSRSAGOPTTL 


3359 


A 


3 


368 


EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 
AWAGRATSM*TSSySSEYQPQTP*ALVTLPPRSY 
YLLTHLLTLTHLHHQILFEP 


3360 


A 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAVA^RKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMKV 
KSPEIISRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 


3361 


A 


4619 


532 


LLLGRANSPPYNSWRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPBCPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL 

MMMVKEKMHIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTRRVRKKLIRVEBMKKP\STEGGEE 

HVFENSPVLDERSALYSGVHKXPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVnCSPTASRISLGKKVKSVKET 

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDUDIISKPPMGTWMG 

LLNNKVGTFNFIYVDVLSED\EEKPKRPTRRRRK 

GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSKNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVL>J\KNRRKPPSFPSCRSC\ETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPQKLTTKKLEGSIAASGRGLSPPQCLPKlsfY 

DAQPPGAKHGLARTPLEGHRKGHEFEGTHHPLG 

TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTENKL\ 

HAEGIRSSRREPYS*LRHGRCGI\P\EALVQRYAED 
LDQPERDVAANMDQIRVKQLRKQHRMAIPSGGL 
TEICRKPVSPGCIS\SVSDWLISIGLPMYAGTLSTA 
GFSTL\SQVPSLSHTCLQEAG\ITEERHIRK\LLSAA 
RLFKLPPGPEAM 


3362 


A 


1 


4653 


FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 
SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 
SISVEEGKENBLHVSENVIFTDVNSILRYLARVAT 
TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 
TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 
NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 
VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 
MGKVTVRFPPEASGYLfflGHAKAALLNQHYQV 
NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 
HIKPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 
TPGEQIKAEREQRffiSKHRKNPIEKNLQMWEEMK 
KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 
IQPHPRTGN*Y\NV\YPTYDFACPIVDSIEGVTHAL 
RTTEYHDRDEQFmnEAlGIRKPYIWEYSRLNL 
NNTVLSICRKLTWFVNEGLVDGWDDPRFPTVRG 
VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 
WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 
EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 
TFSEGEMVTFmWGNLNITKIHKNADGKnSLDAK 
L>n.ENKI)YKKTTKyTWLAETTHALPIPVICVTYE 
HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 
LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 
PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 
APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 
VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 
TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 
VAAQGEVVRKLKAEKSPKAKTNEAVECLLSLKA 
QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 
TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 
AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 
KKKEKENKSEKQNKPQKQNDGQRKDPSKNQGG 
GLSSSGAGEGQGPKKQTRLGLEAKKNEENLADW 
YSQVITKSEMIEYHDISGCYILRPWAYAIWEAIKD 
FFDAEEKKLGVENCYFPMFVSQSALEKEKTHVA 
DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 
YAKWVQSHRDLPIKLNQWCNWRWEFKHPQPF 
LRTKEFLWQEGHSAFATMEEAAEEVLQILDLYA 
QVYEELLAIPWKGRKTEKEKFAGGDYTTTIEAF 
ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 
EKQFAYQNSWGLTTRTIGVMTMVHGDNMGLVL 
PPRVACVQVA/IIPCGITNALSEEDKEALIAKCNDY 
RRRIiLSVNIRVRADLRDNYSPGWKFNHWELKG 
VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN
EAETKLQAILEDlQVTLFTElASEDLKTHMWANT 
MEDFQKJLDSGKTVQIPFGGEIDCEDWIKKTTARD 
QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 
KNPAKYYTLFGRSY


3363 


A 


3797 


1514 


LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 
IJ^APKETDCVLTQK\LI\ETLKPFGGFLKKEEGTA 
SRRNFNFGKN*INLVKEWIRRNQ*KAKNLPQSVI\

ENV\GGKIFT/FLGSYRL/GEVHTKGAD1DGVCVF 

APRHVDRSDFFTVSFYDKLKLQEEVKDLRAVEEA 

FVPVIKLCFDGIEIDILFARLALQTIPEDLDLRDDS 

LLKNLDIRCIRSLNGCRVTDEILHLVPNIDNFRLT 

LBIAIKLWAKRHNIYSNILGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPnTPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEILLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESIORJLVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTMWVIGLVFI<aCTENSENLSVDLTy 

DIQSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNHVLQKKKKHSTEGVKLTALNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKBPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTDLSDIPALPANPIP 

VIKNSIKLRLNR 


3364 


A 


54


3073 


SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHVVREKTGAEEQ/WKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTGIILQU'LQSDPYLSSVS 

fflVLDEIHERNLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPWEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLnPLHSLMPTVN 

QTQVFKRTPPGVRKIVIATNIAETSITIDDWYVID 

GGKDCETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWnQLPEIF/R 

GTPLEELCLQIKVLRLGGyOLFLSRLMDPPSNEA 

VLLSIRQL\RSLNALDKQEELTPLGVHLARLPVEP

mCKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTWNAFEGWEEA 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF

AEHLLGAGFVSSRNPKDPESNINSDNEKinCAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQIGDNDQETIAVDEWIVF 

QSPARIAHLVKRAWHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 


3365 


A 


439 


878 


ECa^VRPLRETDLLKMKRKPRASSPVVEEQPRA 

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 


3366 


A 




827 


FRGYWGVREAFTOASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGRIQLREQLPRYLMGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHYFVATQDQNLSVKVKKKPGVPLM 

FnQ>rnviVLDKPSPKTlAFVKAVESG\RLSQCMRK 

KVSNISKRNRV**KTLNRGRRKKRKK1SGPNPLS 

CLKKKXKAPDTQSSASEKKRKRKRJKNRSNPKV 

LSEKQNAEGE 


3367 


A 


40


1467 


MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAGVQIGNACWELFCLEHGIQADGTFDAQASK 

INDDDSFTTFFSETGNGKHVPRAVMIDLEPTWD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAV\^PYNSILTTHTTLEHSDCAFMVDNEAriT)I 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDVWKDYNVAIAAIKTKRTIQFVDWCPT 

GFKVGINYQPPTVVPGGDLAKVQRAVCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF


3368 


A 


3 


2597


SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE

LNIPHAGAWAQIPEETGLPQGRDTTQLLASEMV

HLMMK\LKEKR\RAR*AQKKKMEAAFTKQRQKM

GRTAFLTWKKKGDGISPLREEAAGAEDEKVYT

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS

WVISPPQPSPQKQIRDFKPSKQAGLSSA1APFSSD\

SPR\PTHPSSTSLLNRKSASESVKSQRTPRPNELKI

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD

DQKAENDMAMKRAALLEKRLRREKETQLRKQQ

LEAEMEHKKEETRRKTEEERQKKEDERARREFIR

QEYMRRJECQLKLMEDMDTVIKPRPQVVKQKKQR 

PKSIHRDHffiSPKTPIKGPPVSSLSLASLNTGDNES 

VHSGKRTPRSESVEGFLSPSRCGSR2>}GEBaDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHnQNAL 

AHCCLAGKVNEGQKKKILEEMEKSDANNFLILF 

RDSGCQFRSLYTYCPETEEINKLTGIGPKSITKKM 

ffiGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 

WQTKRPVTPKKLLPTKA 


3369 


A 


977 


594 


RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 


3370 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEBCEAG/G 

DLLNRMIVWKHGLLI 


3371 


A 


345 


1383 


DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

ySAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 

TQEEISAEVSAKAEKVHELNEEIGICLLAKAEQLG 

AEGNVDESQKILMEVEKVRAKKKEAEKTVAEK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRDRHRRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 


3372 


A 


239 


3348 


PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 

MSDDVHSLGKVTSDLAKRRXLTS\*GGLSEELGS 

ARRSGEVTLTKGDPGSLEEWETWGDDFSLYYD 

SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 

EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 

KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 

GSSGPSEYMEVPLGSLELPSEGtLSPNHAGVSND 

TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 

CMATESVDGELSGCNAAILKRETMRPSSRVALM

VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 

DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 

QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 

TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 

PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 

LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 

DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 

VDKQQRTPLMEAWNNHLEVARYMVQRGGCV 

YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 

VNAQDSGGWTPnWAAEHKHIEVIRMLLTRGAD 

VTLTDNEENICLHWASFTGSAAL^VLLNARCDL 

HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 

ELKNKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAIRTEKnCRDVARGYENVPIPCVNGVDG 

EPCPEDYKYISENCETSTMNIDRMTHLQHCTCV 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

PLIFECNQACSCWRNCKNRVVQSGIKVRLQLYR 

TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 

VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 

HLCDPNIBPVRVFMLHQDLRFPRIAFFSSRDIRTGE 

ELGFDYGDRFWDIKSKYFTCQCGSEKCKHSAEAI 

ALEQSRLARLDPHPELLPELGSLPPVNT 


3373 


A 


587 


1584 


PDGRLWSCSEDKTKIWDTTNKQCVNNFSDSVG 

FANFVDFNPSGTCDIlSAGSDQTVKVWDVRVNKL 

LQHYQVHSGGVNCISFHPSGNYLITASSDGTLIOL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPWGLCQTRV 

KRPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 

KVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 

LTLTEDKLKDCLENQQKLFSAVQQKS 


3374 


A 


398 


21 


WLYPMALSn^DIKMSPSWYFHMAIGIINWNTTAG 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSPITLFKIVHYWPGAVAHAYNPSTLG 
GQVG/WQ1T*GQEFETSLDYMVKPHLY 


3375 


A 


3 


1051 


VPTQQILAFPEQTNTKDWTVTPEHVLPESQSLLT 
FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 
ETVISLALFVLPKPKVISCLEQGEEPWVQVSPEFK 
DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 
VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 
DFPVKKRKKLSTWKQELLKLMDRHKKDCAREK 
PFKCQECGKTFRVSS\DL\IKHQRIHTEEKPYKCQ 
QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 
FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 
HLIKHRRTHTGEQPYTCSICRRNFSRRSSLLRHQK 
LHL*REACPVSHFWKTF


3376 


A 


137 


2329


SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCRKGALRQKWHEVKSHKFT 

ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVy 

HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 

TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 

SVEVWDWDRTSRNDFMGAMSFGVSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTDPKRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 

ELYAnOLKKDVIVQDDDVDCTLVEKRVLALGG 

RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 

DLMYHIQQLGKFJCEPHAAFYAAEIAIGLFFLHNQ 

Gl^m)LKLD^^VMLDAEGmKITDFGMeK:ENVFP 

GTTTRTFCGTPDYIAPEnAYQPYGKSVDWWSFG 

VLLYEMLAGQPPFDGEDEEELFQAIMEQTVTYP 

KSLSREAVAICKGFLTKHPGEAPGASGP*WGNLT 

IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPANLTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM


3377 


A 


918 


738 


SSMLWGFSVFRRSWILNCWLSSSQVGISAACICFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 


3378 


A


1126


456 


FSKLIMKTFnGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLFmTa>LDTIWNRSYFLTlPYEECKRilRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 


3379 


A 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEEBTDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSWSTDQESAEEIPILIIEG 

FLLFNYKPLDTIWNRSYFLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEW 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRMTTNPS/CK*IRKLQGVI 


3380


A 


1443 


794 


ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 

CEQDIYEAVTKINGMI 


3381 


A 


945 


474


SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQTIAETEATYLKILESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 


3382 


A 


1 


1458 


GIRGKMADRGGVGEAAAVGASPASyPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALRIPLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSLISGLBFl 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWKNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E*LCHC]CSTG*GRSNNYCRC*KVI*TGTQGR 

KNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNERFDE*ISSSYHVEHP*

KSL\KSLLELQAYPDVQAVLAKYDDISLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 


3383 


A


282 


2443 


RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

BCEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRWHAVDSCMSFQCKRGfflCKADQQGKT 

SLVSCQDPVTNCPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPT\CRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRNKVKKIYL\DEKRLLAGDHPIDLLLRDFK 

KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 

PLRASLVPMEHCITRFFEECDPNKDKHITLKEWG 

HCFGIKEEDIDENLLF 


3384


A 

A 


3166 


mo- 

92b 


PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLVVSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 

SDVDQWDTAALADGDCSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEBa,LQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDMSREEVNEBCLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLnCVFHRDGHYGFSEPLTF 

CSWDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LRSRIA\EIHESRTAKL\EQQLLVPRASDNKRD/IDK 

PH*TSLKPDLMQLRKIRDQYLVWLTQKGARQKK 

INEWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SWVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 


3385 


A 


43 


2372 


TRDVNSWKELCFNHYNKETTNCYRTTRKWTNY 

KIIFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHHISHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAEDFTQEEWKRLDPAQRKLYRNVML 

*NYNNLITVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFAlsrVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKShJLIDHEIQHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEICPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFrraQKmTRE/KTFKOvIHCGKGFNQTLDLrRH 

LRIHTGEBCPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS 

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNHT 

HQKVHTGEBO'YDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNnTHQKIHTRENPLSVIIVEKASIRLWTSSDI 


3386 


A 


201 


1032 


WDD-\TQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPAWYCHSPSHSLLSAFCHLPTPSRCPAMAR 

PPVPGSWVPNWHES/EIRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSIDTASYKIFVSGKSGVGKT 

ALVAKLAGLEVPWHHETTGIQTTWFWPAKLQ 

ASSRVVMFRPEFWDCGESALJCKFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 


3387 


A 


86 


96 


GSSPDPASLITMKNQDKKNGAAKQSMPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVAKNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEItLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNEKNSKLRQENMELAERLKKLIEQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVK]QRLEKLCRALQT/GAQ*PVRGQRW 

GSHRTSAVRIFS 


3388


A 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 

NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAWnTGCFSYYQEAKSSKIME 

SFKNMVPQQALVIREGEKMQVNAEEVVVGDLV

EnCGGDRVPADLRnSAHGCKVDNSSLTGESEPQT 

RSPDCTHE\NPLKTRMTFFSNNFVEGTARGVWA 

TGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLIT 

GVAVFLGVSFFILSLILGYTWLEAVIFLIGIIVANV 

PEGLLAWTVCLTLTAKRMARKNCLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDMPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTNKYQLSMETEDP 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGV 

GIIFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQroEILQNHTEIVFARTSPQQKLnV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASIVTGVEEGRLI 

FDNLKKSIAYTLTSNIPEITPFLLFIMANIPLPLGTI 

mCEDLGTDMVPAISLAYEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKWEFTCHTAFFVSIWVQWADLIICKTR 

KNSVFQQGMKNKILIFGLFEETALAAFLSYCPGM 

DVALRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 

LRRNPGGWVEKETYY 


3389 


A 


45 


5250 


VERLLGCRNSKRTWRMLISKNMPWRjElLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTErVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSWRKEHNS 

KLTITFPAMVHRTAGQKDSEPLGffiEAQIGKRGY 

LTPTSAREHLSALWKNEGFFLNYLFSGMDDDGM 

ESRFNPSVFFLDFLWPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

WIRLQSHVNTVFDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYlNfADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV

STLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGX^DILVKPKADVICRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCinC 

HLEGLWQYDLTVRDSDGSWQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALKLESENRNGR/RPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDFYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQL\KWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

NIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKKK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE

SLWCQVTVKLPLKOONFDMSSLWSLAHGAVIY 

ATKGITRCLLNETTNNKNEKELVLNTEGINLPELF 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELFCSPSACLVVGKWRGGTGLFELKQPLR 


3390


A 


2 


2080 


ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDElDAYWLELINSELKEMERPELDELTLERVLE 

ELETLCHQNMARABETQEGLGIEYDEDWCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYGE.KVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITiaSHIPASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEMRTELADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRNLCYMVTRRERTKHAICKLQEQIFH 

LQMKLffiQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVP\GPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 


3391 


A 


1555 


327 


NSFLHFLHLKVRTMFLFPSFPVLLLSVVTASCSKT 

KACADTQKTCSMITCGIPVTNGTPGRDGRDRPK 

GEKGEPGLGQVSVAS*ISTSGRCSSKSVLEPATRG 

LKHRLGEAPLSSGPMLHSEQPL*NA1ASKTKLFV 

DSLGSfflSTQELGVCGCPFRGVSCLVGELALVQA 

LH*VAGESFFFGSDHWLIGCAGGEQEWSIELLGK 

KKRVTATGSSSLCLATGQGLRGLQGPPGKMGPP 

GNTGTSGIPGPRGQKGDRGDNSVAEAKLANLER 

KL*SLRSELDHTKKL*PFSLGK\MSGKKLFVTNGE 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 

AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKK 

DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 

FPA 


3392 


A 


218 


1773 


GGSRRNQRRSIPVLGYFLKQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLBCSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

HEQMPTEIEFPESRKPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSEGGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNIHQRTHTGEK 

PYGCIDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLlRHQKfflSGEICPYKCSDCGELAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 


3393    A    46    1464


ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSBETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVIFKKISRDKSVIMYLGNRDYMDHVXSQV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 

LIRKVQHAPLEMGPQPRAEAAWQFFMF\DKPLH 

LAVSLNKRDLFPMGSP]PVPVSVP\NNTEKPVKKI 

KA\SVEQVANVVLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLV]vrNRERRGIALDGKIKH 

EDTNLASSTIIKEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDANLVF\EEFARP*ILKDAGEA*\EGKRDQE 


3394    A    211    1591


RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP
VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH
PLWQAAHEQNQVLNTNSRYLHDNTVDYAQRLS
ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH
QDVWLDHAYHGHLSSLIDISPYKFKNLDGQKE
WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS*
KRVVQGRKIRQICRRQIAAFFAESLPSVGGQIIPPA
GYFSQVAEHIRKAGGVFVADEIQVG'FGRVGKHF
WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLKDEATRTPATEEAAYLVSRL 

KEhPfVLLSTDGPGRMLKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 


3395    A    1    1424


FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LICALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCR.TRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDWCDNCTKEEA 

KGTLNGEKVEHQRTTFVKQLKLGKLPQCLCIHL 

QRLSWSSHGTPLKRHEHVQFNEFLMMDIYKYHL 

LGHKPSQHNPKLNKNPGPTLELODGPGAPTPGL
NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV
VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD

GPHL 


3396    A    109    107


TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP
NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL
PHVRAFAYTWFNLQARKRKYFKKHEKRMSKEE
ERAVKDELLSEKPEVKQKWASRLLAKLRKDIKP
EYREDFVLTVTGKKPPCCVLSNPDQKGKMRRID
CLRQADKVWRLDLVMVILFKGIPLESTDGERLV
KSPQCSNPGLCVQPHHIGVSVKELDLYLAYFVH
AADSSQSESPSQAK*R*H*GPARKWDIWGFQ\DS
FVT\SGVRSVT*A*LRVSQTPI\AAG\TGPNFSLSD
LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED
EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG
MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR
PVITGTQSKFHIATPSIL\HFPRHSPFFQQPGPYFSH
PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS
QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK
PPTTSTEGGAASPTSPTTRS/PGR'MPQQPFL/SYG
PP*PSNALIGGGGGGAGERAGERADLEM


3397    A    1    2002


TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRRLP
VGPLLRALATCHALSRLQDTPVGDPMDLKMVES

TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSVWAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRVVALASICPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRIRA 

VMVTGDNLQTAVTVARGCGMVAPQEHLnVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGnVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASWSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDLQFLAIDLVITTTVAVLMSRT 

GPALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 

TVVFSLSSFQYLEAAAVSKGAPFR\RPLTNNVPF 

LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 

KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 

AQAG\SKKRFKQLERELAEQPWPPLPAGPLR 


3398    A    758    1368


FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 
KRRRRPGRGPWPAPGGQGVGPSAL*KAGSPPAN
RPGQGE/PGLISPKPVTEVLPDVQGAPVPVPPLPT 
PPSLPHLQNQPP/TVQHYLLSFSWKPSQGPE*RA* 
PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 
TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 
SFL 


3399    A    906    1091


HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 


3400    A    1838    325


PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KFYNFVE.HARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAniLLLT\SN 

\FDCR\LSLHQVNQAMMSNLmQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

NTFKPHRLQARKAMWRKEQDTRALKEQSQHLD 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQP\LIIHHAQh'IVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 


3401    A    153    1389


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI
KKEISPLHGMEKCSVGGLELTEQTPALLGNMAM

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFEESIQPPSISAPAIADQROTIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETTVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

VNAGMGNSGITTELTLKYnTNVTTLETGISSVNA 

GQDVNmTYKTSL*NTNLGDVAKGLQSSNFGVNI 

NLQLII/CPEDASTKKA>fV]LPVESSKSFQEFYSTS 
CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 
PKLLFRLTVnLTFKCYYVLFHLHNARVLDV 


3402    A    153    1389


EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI

KKEISPLHGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

VNAGMGNSGITTELTLKYnTNVTTLETGISSVNA 

GQDVNinTYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

mQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PBCLLFRLTVIILTFKCYYVLFHLHNARVLDV 


3403    A    609    2765


SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KKAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKWTFHKLKL 

TN>nSDKHGFtlLNSMHKYQPRFHIVRANr)ILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGKREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDF\SPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

ASAAAAQGLSGASAATLPFHLQQHVLASQGLA 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 

P\FNLNTMRPRLRYSPYSIPVPVPDGSSLLTTALPS 

MAAAAGPLDGKAAALAASPASWAVDSGSELNS 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 

GLEAKPDRSRSASP 


3404    A    1082    1308


LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIIKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 
SHSSP 


3405    A    1553    559


PRPPTQRLSRFAPPCRTAEFPFRRRAWTRPAPPR 

ACTWGRSSPVTGLAVG AA VAMLTVAARSRPFA 

PVLSATSRGVAGALT\P*MQATVPATPEQPVLDL 

KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTDKVPDFSEYRRLEVLDSTKSSRESSEARKGFS 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 

LALAKffiIKLSDIPEGKNMAFKWRGKPLFVRjEI[l.T 

QKEIEQEAAVELSQLRDPQHDLDRVKKPEWVILI 

GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 

RIRLGPAPLNLEVPTYEFTSDDMVTVG 


3406    A    83    2671


CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKKWKDQNTEYEYQNPRRNFRSVT 

EEKVNEIKEDSHCGETFTPVPDDRLNFQKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKNnYHSSIQRHMWHSGDGP'i'K 

CKFCGKAFHWLSLYLMERTHTGEKPYECKQCG 

KSFSYSATHRIHERTfflGEKPYECQECGKAFHSPR 
WO 01/57190

PCT/US01/04098




SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFR^HERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYEOCECGKAFIF 

VNNLQSHERTQTHIRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTHNGEKPY 

ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

NVAKLSLLPVLFNIMKEFTLGRNPISVSNVRKPLF 

LPLLFNIMKGLTWERNPMSVCHVGJCPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRPFEYR 


3407    A    1426    3


PAAPSGASPGRVCGVETARPLGVQRRQS ADEGP 
PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 
PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 
GSLLPVKIIETDFEKAHRSKKILSLGNTFGGGVFL 
ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 
TELLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 
RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 
ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ
EEGEKWSLFVGVAVHETLVPVALGISMAGSAM 
PLRDAAkLAVTVSPMIPLGIGLGLGEKAQGVPG 
SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 
DRLLKVLFXLWGYTVLAGMGLPQVVSGLAIVPA 
AGSPPGAPGRTQAASPGRASPKSEHGGPGPPPVH 
KGPPGTRLCPRSYTLSLRALLLFKJLLSLKSLYQK 
KK


3408    A    106    4514


EARDRLAQSRAKEKELNSVASELSARQEESEHSH 
KHLIELRREFKKNVPEEIREMV APVLKSFQ AEVV 
ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 
PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 
LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP
SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 
LLELRRKYDEEAASKADEVGLIMTNLEKANQRA 
EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 
KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 
SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 
QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 
PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 
DSIKDSLGTEQS\TSPQQLPPPPGPEDPLSPSPGQP 
LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF
KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 
GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 
QLLKHNIGQRVFGHYVLGLSQGSVSEILARPKPV
WRKLHG**GKEPFIKMKQFLSDEQNVLALRnQV
RQRGSITPRIRTPETGSDDAIKSILEQAKKEffiSQK 
GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 
REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT
ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 
QSnRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 

VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED
EAAAGAEDEPPRTGELBCAEGATAEAGARLPYYP
AYVPRTLKPTVPPLTPEQYELYMYREVDTLELTR
QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR
PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ
PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE
SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV
AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL
GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL
WLNDPHNVEKLRDMKKLEKKAYLKRRYGLIST
GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL
APEEICEALRKAYQLEPYPSQQTIELLSFQLNLKT
NTVINWFHNYRSRMRREMLVEGTQDEPDLDPSG
GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL
QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE
AGSQPQDSGELDKGQGPPKEEHPDPPGNDGLPK
VAPGPLLPGGSTPDCPSLHPQQBSEAGERLHPDP
LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP
VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP
SAKVOTNLQRRHEKMANLNNIIYRLERAANREE

ALEWEF 


3409    A    162    1710


GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLIKHQRIHSG*\RPYECTECGKSF 

SQNSSLIEHHRVHTGERPYKCSECGKSFRQRSAL 

LQHRGVPTGERPYECSEGGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRIHSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 


3410    A    167    789


LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TAWGSPHPEAVLQLEVAPESSGPCTDTAKDQQS 

DKLPDLMPPA\EPLGSALELRASLEIDVAE\RGCE 

HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 

Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 

V/GEQEBCEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

LGFWAICP 


3411    A    1040    887


ASLSKPAGISTMPWALILLFLLTHSAVS WQAGL 

TQPPSVSKDLR\QTATLTCTGNSNNVGHQGVIWL 

QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 

ASLTIYGLQTEHEAD**CRPRRKLIPKTARLFFFFL 

IDNEEYLLRVY 


3412    A    164    83


RRGIPGSASLSLTMCVRSCFQSPRLQWVWRTAFL 

BOrTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 

DPRIYKLCLEQLGLQPSESIFLDDLGTNLKEAARL 

GmTIKV>roPETAVKELEALLGFTLRVGVPNTRP 

VKKTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQS>fPTYYIRLANRDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLma)PSLPGLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWVA'SSKRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGBPAAEEYFRMYCLQMGL 

iPPTENWNFYMAFSFFRVAAILQGVYKRSLTGQA 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W»GGRSGRTSWRLLALGCHT 


3413    A    105    1573


PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 
DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 
SPALGSTKEFRRTRSLHGPCPVnTGPKACVLQN 
PQTMHIQDPASQRLTWNKSPKSVLVUCKMRDAS 
LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD
ESFGAVKXKFCTFREDYDDISNQEDFnCLGGDGT 
LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 
SQVTQVIEGNAAVVL/RGSRLKVRVYKELRGKK 
TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 
NEWTDRGPSSYLSNVDVYLDGHLITTVQGD/G* 
GPQHLSWGP*AFLGRE*RLRLSLSGVrVSTPTGST 
AYAAAAGASMIHPNVPAIMITPICPHSLSFRPIW 
PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 
DSlSITTSCYPLPSICVRDPVSDWFESLAQCLHWN 
VRKKQAHFEEEEEEEEEG 


3414    A    20    2602


VIVNKNVNWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNlKAPHAV\ai-MNTKGHHWLT 

NARLTKYQSLPCENPHITIEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAVVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRIGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

WDEFPAQKNHPDNFWVLKASnRQYYIARVEKD 

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLP\ASSCVIGTI 

KPSFFLLSIKTGELLGFPVYASR\KSIAIRN*NNDK 

WPPERQQYYGPAT^AQDGSWGYRIPIYMINRnRL 

QAVLKHTATGRALTDLAQQETQMRNAn'QNRLA 

LDYLLAAEGEVCRKFNLTNCCLHIDNQGQWED 

IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 

GGFKTLHRVnVIGTYLLLPRLLPVLLQMIKSFlAT 

LVYQNASAQVYYINHY


3415    A    455    108


NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 


3416    A    1    874


FFFFQRINFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDWQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 


3417    A    243    847


CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPR\KS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 


3418    A    4073    1000


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 
FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTBCRKNI 
RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP
PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 
LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 
LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 
LHHFRPDLmYKSLNPQDKENNKKAYDGFASIGI
SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 
ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF 
YAELSDLKREPELQQPISG AVDFLSQDDSVFVND 
SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 
QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 
RLLKAETLELSDLYVSDKKKDMSPPFTCEETDEQ 
KLQTLDIGSNLEKEKLENSRSLBCRSDPESPIKKT 
SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 
ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL
LEQARRDAALKAGNKHNTNTATPFCNRQLSbQ 
QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 
MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 
GGGDELTNLENDLDTPEQNSKLVDLBCLKKLLEV 
QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 
ERLQKTTERFRNPWFSKDSTVRKTQLQSFSQYI 
ENRPEZ^OCRQRSIQEDTKXGNEEKAAITETQRKPS 
EDEVLNKGFKDSVSQYVVGELAALENEQKQEDTR 
AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 
\mKKNALIRRMNQLSLLEKEHDLERRYELLNRE 
LRAML AffiDWQKTEAQKRREQLLLDELVALVN 
KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 


3419    A    4073    1000


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELINKLNFLDEAEBCDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGY 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTXNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLDDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVO-LAIPDKLTVMTYLYQIRAHFSGQ 

ELNWQffiENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPnCEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKIERARVL 

LEQARRDAALICAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFW 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDSVSQYWGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRlvlNQLSLLEKEHDLERRYELLNRE 

LRAMLAJEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 


3420    A    612    1058


ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALIITTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDiBLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 


3421    A    23    2005


LLTPCDGRIPGRPSVGAESGSDFQQRRRRRJRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEAILAKKRKVLEFERVYLDNLPSASMYERS 

Y^4HRDVITHWCTKTDFIITASHDGHVKFWKK[E 

EGJEFVB^^RSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVNFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMffiYWTGPPHE 

YKFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRIFRFVTGBa.MRVFDESLS 

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFVLYGTMLGIKVINVETNRCV 

RILGKQENIRVMQLALFQGIAKXHRAATTIEMKA 

SEl^V^QNIQADPTIVCTSFKKNRFYMFTKREPE 

DTKSADSDRDVFNEKPSKEEVMAATQAEGPKRV 

SDSAIIHTSMGDIHTKLFPVECPKTVENFCVHSRN 

GYYNGHTFHRIDCGFMIQTGDPTGTGMGGESIWG 

OtrlllJJ&rrlall^KrUJKr I I l^oiVlAN AubN 1 NOSQFF 

ITWPTPWLDNKHTVFGRVnCGMEWQRISNWK 

VNPKTDKPYEDVSDNITVK 


3422    A    2486    433


FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ 
DFQLRNLRIBEPNEVTHSGDTGVETDGRMPPKVT 

SELLRQLRQAMKNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAITTEEHAAMWTD 

GRYIT.QAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLDPTDYWKKMAKVLRSA 

GHHLIPVKENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFWTALDEI 

AWLFNLRGSDVEHNPVFFSYAHGLETIMLFroGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICIAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMTVTDEPGYYEDGAFGIRIENWLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 


3423    A    5515    934


FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRiiGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDffiRPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNiSHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTTVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGfflQVNDKIVAVDGVNIQGFANHDWEVL 

KNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTWE

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSrVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASViDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKTVELVKDCKGLGFSILDYQDPLDP 

■mSVrVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEffiQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMIPND VQGPSLLIDLPV VAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRTVEIFREPNVSLGISrV 

GGQTVKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

WFTVQSLSSTPRVffNVHNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 

DAFTDQmQRYADLPGELHIIELEKDKNGLGLS 
LAGNKDRSRMSIFWGINPEGPAAADGRMHIGD 
ELLEINNQILYGRSHQNXASAIIKTAPSKVKLVFIR 
NEDAVNQMAVTPFPVPSSSPSSEEDQSGTEPISSEE 
\DGSLE\VGIKQLPESESFKLAVSQMKQQKYPTKV 
SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD
PATCPIVPGQEMIIEISKRRSGLGLSrVGGKDTPLV 
NGVDLRNSSHEEAITALRQTPQKVRLWYRDEA 
HYRDEENLEIFPVDLQKKAGRGLGLSIVGKR 


3424    A    2223    1162


HASERWQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKPXGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 


3425    A    2223    1162


HASERWQLPDFVWDQYTHSLGRVEREFKNRKR 
HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 
MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 
WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 
NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI
YSKKSSTSRRQHPLNKHLFKPXGTFMTSHEPPVY 
MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 
LPEYBCELLQFKKLKKQKLQHMQAESGFVQHVGF 
KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFaDS 
aSDCLHETiDIHKGDHQLEPIYRSVETFLDRDYCV 
SQGTSYNYLDPNYFPANR 


3426    A    2    1553


LFWVHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQraRIHHHPQ 

MLGPCRQBICGITMAAGTLYTYPENWRAFKALI 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQWQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRKYSNEDTLSVALPYFWEHFDKDGW 

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGKAFNQGKIFK 


3427    A    755    52


TAARRRQKGTAARRRQKGTAARRRQKGTAARR 
RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 
AARRRQKGTAARRRQKGTAARRRQKGTAARRR
QKGLSNLDAAEWLPPKKG\GEKKKGPFLAINEV 
VTVREYPINILKRIHGVGFKKRAPRALKEIRKFAM 
KEMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 
LSRKRl-JEDEDSPNKLYTLVTYWVTTFKNLQTV 
NVDEN 


3428    A    4    1939


LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP 

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ 

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TOV\WALYKNNVPATYTYDEYKKGYLDQASG

GAVLQLRPNDQVWVQMPSDQANGLYSTEYfflSS 

FSGFLLCPT 


3429    A    212    1075


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQ\AQLGQLS 

YLAPGEDGHWVPIPEEESLQR<^WQDAAACPRGL 

QLQCRGAGGRPVLYQWAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGKLGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 


3430    A    799    1989


INKYmiRKKIKLLSPLPPLWSHLALLQASATKWV 
LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 
TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 
PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 
faiGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 
LVRFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 
YLDILAISCDSFDEEVNCPMGRGN\GKKNHVENL 
QKL\RRWCRDYRVPFKINSVINPF\NVEEDMTEQI 
KALNPVRWKVFQCLLIEGENCGEDANLREAERFV 
IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 
DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF
DEKMFLKRGGKYIWSKADLKLDW 


3431    A    5468    2146


ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDKELENVASFRS\!^'KRIPWVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNND.TSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEWFMGMANIHAIRNSFQYL 

RAVCSQMPDPSNWLSALESTKWLQHLSVMLKA 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC 

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEF>ffiAFLVKLVQHTYSCLYGTFLANNPC\EREK 

RMYmGTCSWALLRAGNKNFHNFLYTPSSD 

MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN 

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDFXLNQDPSGSVASISHQEQLSSVP 

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RLRQIEAGYKQEVEQLRRQVRELQMRLDIRHCC 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

DCLSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEfflQVSRARELMSQQLKK 

PIATASS 


3432 


A 


36 


1873 


MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 

LYRLTMDLCSKLKDYGLWQLFRTLELPLEPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR 

TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH 

KIKSTFVDGLLACMKKGSISSTWNQTGTVTGRLS 

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT

KKWYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYKKIKDFARAAIAQCHQTGCVVSIMGRRR 

PLPRIHAHDQQLRAQAERQAVNFWQGSAADLC 

KLAMIHVFTAVAASHTLTARLVAQIHDELLFEVE 

DPQPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAW\ALRQAHVALSLPATAWLPLGPLP 

APSPPiPCIFRLHFVCSPRQQWEERTGFQQSIVWPS 

PRSPALYAPGRINPLGLGWPAIPWSKCLCKALKK 

K 


3433 


A 


1481 


476 


ffPKERAPGIRASCLAlTAGARPTSYGRVGCEGDV 

RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 

ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 

BCESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

LSTLQVCNWFINARRRLLPDMLRKDGKDPNQFTI 

SRRGAKISETSSVESVMGIKNFMPALEETPFHSFTv 

AGPNPTLG\RPLSAKP/SQSPGSVLARPSVICHTTV 

TAIERLSLSLSCQSVG CGQNT\DIQQIAT\RJtt,RDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 


3434 


A 


1720 


1243 


NGPVPPGGSKTKWAGGSAAEGSPRLSPSPGAAQ 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 

PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 

JGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 

LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 

RTVQLNVCSSEEVEKV/VGDCPLEPEGP\EKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 

VIUvlTKSFLPLIRRAKGRVVNISSMLGRMANPAR 

SPYCITKFGVEAFSDCLRYEMYPLGVKVSWEPG 

NFIyVATSLYSPESIQAIAKKMWEELPEVVRKDYG 

lOCYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT 

ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM 

lYm 


3435 


A 


842 


3595 


ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERXTBAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEffiRFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGDEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRK.GPQTVNSS 

SIYSMYLQQATPPBCNYQPAAHSALNK.SVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIQKLLYQRFNTLAGGMEGTPFYQPSPSQ 

DFNIVTLADVDNGNTNANGNLEELPPAQPTAPLP

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 

FNPLALLLDASLEGEFDLVQRHYEVEDPSKPNDE 

GITPLHNAVCAGHHHIVKFLLDFG VNVNAADSD 

GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 

EETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 

KGVAYALWDYEAQNSDELSFHEGDALTILRRKD 

E 


3436 


A 


3 


2604 


gsthasekmktgrsalvvtdtgdmsvlnsprhq 

scmhvdmdcffvsvgirnrpdlkgkpvavtsn 

rgtgraplrpganpqlewqyyqnkilkgkadip 

dsslwenpdsaqangidsvlsraeiascsyearq 

lgd™gmffghakqlcpnlqavpydfhaykeva 

qtlyetlas\ythnieavscdealvditenaetk 

ltpdefanavrmeikdqtkcaasvgigsnillar 

matrkakpdgqyhlkpeevddfirgqlvtnlpg 

vghsmesklaslgiktcgdlqymtmaklqkef 

gpktgqmlyrfcrglddrpvrtekerksvsaei 

nygirftqpkeaeafllslseeiqrrleatgmkg 

krltlkimvrkpgapvetakfgghgicdniartv 

tldqatdnakiigkamlnmfhtmklnisdmrgv 

gihvnqlvptnlnpstcpsrpsvqsshfpsgsysv 

rdvfqvqkakksteeehkevfraavdleissasr 

tctflppfpahlptspdtnkaessgkwnglhtpv 

svqsrlnlseevpspsqldqsvlealppdlreqve 

qvcavqqaeshgdkkkepvngcntgilpqpvgt 

vllqipepqesnsdaginlialpafsqvdpevfaa 

lpaelqrelkaaydqrqrqgensthqqsasasv 

PK2^LLHLKAAVKJEKJGlNKKKKTIGSPBaUQSPL 
NNKLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 
EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAVEFNDVKTLLREWrmSDPMEEDILQVVKY 
CTDLIEEKDLEKLDLVIKYMKKLMQQSVESVWN 
MAFDFELDNVQWLQQTYGSTLKVT 


3437 


A 


32 


4038 


SLLRLLKAQWGSSGAASEPWLGEEGCGFPSTNE 

YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARKNPLHNPWGM 

ELAASENTDSPSPRPLRPGVTLPPGALTMNTKDT 

TEVAENSHHLKIFLPKKLLECLPRCPLLPPERLRW 

NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL

YMIKKVKYRKDGYLWKKRKDGKTTREDHMKL 

KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 

WSREELLGQLiCPMFHGIKWSCGNGTEEFSVEHL 

VQQELDTHPTKPAPRTHACLCSGGLGSGSLTHKC 

SSTKHRHSPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTADLLLTGLEQRAGGLTPTRHLAPQADPR 

PSMSLAVWGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEA AHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSILERLEQMEKRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVVVLVESMiP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

lETLSQWRSVETGSLDLEQEYDPLNVDHFSCTPL 

MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 

DYEATNSKGPLSSLPALPPASDDGAAPEDADSPQ 

AVDVIPVDMISLAKQIIEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYL\ENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLIFALLTL\SD\QEQRELYEAARVIQTAFRKYK 

GRRLKEQQEVAAAVIQRCYRKYICQLTWIALKFA 

LYKKMTQAAILIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 


3438 


A 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQSLK 

WTKRGSADGCTDWSroiKKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

EPIAFDGSRMSKEEDSIWFRPTLLQDSGLYACVIR 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDffiDFLLPTREPEILWYKECRTKT 

WRPSr/FKRDTLLIREVREDDIGNYTCELKYGGF 

WRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFIE 

DLDENRVWESDI\KILKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKGYKIEIMLFYKlSfHF 

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGTYI 

EDVARCVDQSKRLnVMTPNYVVRRGWSIFELET 

RLR2«JMLVTGEIKVILffiCSELRGIMNYQEVEALK 

HTIKLLTVIKWHGPKCNKLNSKFWKRLQYEMPF 

KRIEPrraEQALDVSEQGPFGELQTVSAISMAAAT 

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLINGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 


3439


A 


251 


2037 


GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDWTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL 

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV 

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY 

GLSITKGKKAKFIVNDELPNCILCGAITMKTSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

LKSLCTKKIFLYKQVFPLNLERATLAnGLIGLKGS 

ILSGTELQARWVTRVFKGLCKRPASQKLMMEAT 

EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT 

KPSIPLLFLKDPRLAWEVFFGPCTPYQYRNLMGPG

KWDGARNAILTQWDRTLKPLKTRIVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLL1CK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 


3440 


A


1 


3533


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCffiSVM 
ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 
ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 
IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 
AKHTSALCNACRIASSKTANPVAKRHFVQSAKE 
VANSTANLVKTIKALDGDFSEDNKNKCRIATAPL 
ffiAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 
SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 
HSHTVSDSIKSLITSmDKAPGQRECDYSIDGrNRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSW 
QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 
LELAAVGVASKILDHQQQMTVLDQTKTLAESAL
QMLYAAKEGGGNPKAQHTHDAITEAAQLMKEA 
VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 
DEGTPPEPKGTFVDYQTTWKYSKAIAVTAQEM 
MTKSVTNPEELGGLASQMTSDYGHLAJFQGQMA 
AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGVAL 
QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 
NKGTQACITAATAVSGIIADLDTTIMFATAGTLN 
AENSETFADHRENILKTAKALVEDTKLLVSGAAS 
TPDKLAQAAQSSAATTTQLAEWKLGAASLGSD 
DPETQWLINAIKDVAKALSDLISATKGAASKPV
DDPSMYQLKGAAKVMVTNTVTSLLKTVKAVEDE 
ATRGTRALEATIECnCQELTVFQSKDVPEKTSSPE 
ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN 

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVJLQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASEEAAAiaCLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS 

LCEAANASVQGHASEEKLISSAKQVAASTAQLL 

VACKVKA.DQDSEA1VIRRLQAAGNAVKRASDNL 

VRAAQKAAFGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKERELEEARKKLAQIRQQQYKFLPTEL 

REDEG 


3441 


A 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSALHS 
PAHRPPGFSVAQKPFGATYVWSSnNTLQTQVEV 
KKRRHRLKRHNDCFVGSEAVDVIFSHLIQNKYF 
GDVDIPRAKWRVCQALMDYKVFEAVPTKVFG 
KDKKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 
SPARYADALFKSSDIRSASLEDLWENLSLKPANS 
PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 
DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 
KAYSDSQEDEWLSAAIDCLEYLPDQMWErSRSF 
PEQPDRTDLVKELLFDAIGRYYSSREPLLNHLSD 
VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE 
EFRRLLYFMA VAANPSEFKLQKESDNRM VVKRI 
FSKAIVDNKNLSKGKTDLLVLFLVMDHQKDVFKI 
PGTL\HKIVS\VK\LMAIQNGRDPNRDAGY1YCQRI 
DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 
KKKVLLGQFYKCHPDIFIEHFGD


3442 


A 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ 

VAEDKFVFDLPDYESIMJVWFMLGTIPFPEGMG

GSVYFSYPDSNGMPVWQLLGFVTNGKPSAIFKIS 

GLKSGEGSQHPFGAMNP/RTPSVAQIGISVELLDS 

MAQQTPVGNAAVSSVDSFTQFTQKMLDNFYNF 

ASSFAVSQ/VPDDTQ/RPSEMFIPANWLKWYENF 

QRRTSTEPSLLENIIWrKTNF 


3443 


A 


3 


1373 


swhvrrrwleatmaggmkvavspavgpgpwg 

sgvggggtvrlllilsgclvygtaetdvnWml 

qesqvcekrasqqfcytnvlipqwhdrwtriqir 

vnssrlvrvtqveneeklkeleqfsiwnffssfl 

keklndtyvnvglystktclkveiiekdtkysvi 

verrfdpklflvfllglmlffcgdllsrsqifyys 

tgmtvgivasl\liiifilskfmpkkspiyvilvggw 

sfslyliqlv7knlqeiwrcywqyllsyvltvgf 

msfavcykygplenersinlltwtlqlmglcfm 

YSGIQIPHIALAiniALCTKNLEHPIQWLYlTCRKV 
CKGAEKPVPPRLLTEEEYRIQGEVETRKALEELR 
EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 
PNEVSVHEQEYGLGSIIAQDEIYEEASSEEEDSYS 
RCPAITQNNFLT 


3444 


A 


566


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 
EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

AAP QTTP A XTVT A P'HTDP T rz A TTF TM5TXJOT /^TXTCOT CT 
lVJJ\olir/\lN I lAivJJ 1 i\J\J_.ijA iil^iJjSJxloi-fV^liNooJLb I 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 
GGQDTFME>JYFTSQRDNIFRNVEVLIYVFDVESR 
ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 
LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSrVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNl 

nCQFKLSCSKLAASFQSMEVKNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAA'n.rNIRNARKHFEKLERV 

DGPKQCLLMR 


3445 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSnFANYlARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC

GGQDTFMENm^QRDNIFRNVEVLrYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSrVYQLIPNVQQLEMNLKNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNl 

nCQFKLSCSKLAASFQSMEVKNSNFAAFIDIFTSN 

TYVMWMSDPSIPSAATLINIRNARiaiFEKLERV 

DGPKQCLLMR 


3446 


A 


566 


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSnFANYIARDtRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC
GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKFCLVHKMD
LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 
DETLYKAWSSIVYQLIPNVQQLEMNLKNFAEIIE 
ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 
IKQFKLSCSKLAASFQSMEVR2>}SNFAAFIDIFTSN 
TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 
DGPKQCLLMR 


3447 


A 


1 


2930 


VLLGPLWDKLSTADHPVrVTMASKRKSTTPCMEP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEKnrOFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEraT 

KTPIMKIMKGKAEAKKIHTLKENVPSQPVGEALP

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHJOTYPTKAELCYLTVVTKYPEEQLKIW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHWGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSmDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 

WFSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEWRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 

MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 

TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 

VSENSESWEPRVPEASSEPFDMSSPQAGRQLETD 


3448 


A 


2 


1324 


FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNa.TKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNKIDPHAPNEMLYGRIGYIY 

ALLFVNKNFGVEKIPQSHIQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LMQPSLQVSQGKLHSLVKPSVDYVCQLKFPSGN 

YPPCIGD>fRDLLVHWCHGAPGVIYMLIQAYKVF 

R/EREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAPLTLYNLTQDMKYLYRACKFAEWC 

LEYGEHGCRTPDTPFSLFEGMAGTIYFL\ADLLFP. 

TKAR\FPAFEL


3449 


A 


3 


2389 


SRfTVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGBCLYAMKVLRKAALVQRAK

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLELDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGHYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTEEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGELLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPREFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGWHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

ESCDLWSLGVTLYVMMLSGQAPFQGASGQGGQS 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAYRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR

AiPVASKGAPRRANGPLPPS 


3450 


A 


201 


1705


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

VPELPGFYFDPEKKRYFRLLPGHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLN\TNYCHLAHELRLSCMERKKVQIRS 

MDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

KYGHNLQSLKTPTLKVFMHENLYFTNRKVXNSV 

CWASLNHLDSHILLCLMGLAETPGCATLLPASLF 

VNSHPAGIDRPG\MLCSFRrPGAWSCA WSLNIQA 

NNCFSTGLSRRVLLTNWTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFAIDLRCGNQGKGW 

KATRLFHDSAVTSVRILQDEQYLMASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

LVAVGQDCYTRIWSLHDARLLRTIPSPYPASKAD 

IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 


3451 


A 


19 


6033 


LLSAMLSHGAGLALWTTLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA 

ALCPTFRIDNTTYGC>nLQDLQAGTrYNFKnSLDE 

ERTWLQTDPLPPARPGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGWDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPG\NVDSYNITLSHKGT1KESR 

VLAPWI'RETHFKELVPGRLYXQVTCSAVSLGELS 

AQKMXAVGRTFPDKVANLEANNNGRMRSLWS 

WSPPAGDWEQYRILLFNDSWLLNirVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGM.KNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVWTTVSGGISSR
QVVVEGRTVPSSVSGVTVNNSGRNDYLSVSWLL 
APGDVDNYEVTLSHDGKWQSLVIAKSVRECSF 
SSLTPGRLYTVTITTRSGKYENHSFSQERT VPDKV 
QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK 
NKNNFIQTKSIPKSENECVFVQLVPGRLYS VTVT 

TKSGQYEANEQGNGRTIPEPVKDLTLRNRS7EDL
HVTWSGANGDVDQYEIQLLFNDMKVFPPFHLVN 
TATEYRFTSLTPGRQYKILVLTISGDVQQSAFffiG 
FTVPSAVKNIfflSPNGATDSLTVNWTPGGGDVDS 
YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 
YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 
AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 
NTSEPATTKQHKFEDLTPGKKYKIQBLTVSGGLFS 
KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS 
EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ 
M.LQGRMYKMVIVTHSGELSNESFIFGRTVPASV 
SHLRGSNRNTTDSLWFNWSPASGDFDFYELILYN 
PNGTKKENWKDKDLTEWRFQGLVPGKKYVLW 
WTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 
SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 
YNNRKSEGRIVYGLRPGRSYQFNVKTVSGDSWK 
TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 
PPDSDFDGYSIECRKMDTQEVEFSRKLEKEKSLL 
NIMMLVPHKRYLVSnCVQSAGMTSEVVEDSTIT 
MIDRPPPPPPHniVNEKDVLISKSSINFTVNCSWFS 
DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 
LEYRHNASIRVYQTNYFASKCAENPNSNSKSFNI 
KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 
SERAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 
LFGAffiGVSAGLFLlGMLVAWALLICRQKVSHG 
RERPSARLSIRRDRPLSVHLNLGQKGNRICTSCPIK 
INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 
SCDIALLPENRGJtNRYKNILPYDATRVKLSNVDD 
DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 
WKMVWEQ>rVHNIVMVTQCVEKGRVKCDHYW 

PADQDSLYYGDLILQMLSESVLPEWTIREFKICGE 

EQLDAHRLIRHFHYTVWPDHGVPETTQSLIQFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQLDSKDSVDIYGAV\HDLRLHRVHMVQTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 

NPEYHRDPVYSRH 


3452 


A 


63 


1073 


FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVWDTPGIFDTE 

VPNAETSKEIIRCILLTSPGPHALLLWPLGRYTEE 

EHKATEKILKMFGERARSFMILIFTRKDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLtGLIQRVVRENKEGCYTNRMYQR 

AEEEIQKQTQAMQELHRVELEREKARIREEYEEK 

IRKLEDKVEQEKRKKQMEKKLAEQEAHYAVRQ 

QRARTEVESKDGELELIMTALQIASFILLRLFAED


3453 


A 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAV 
DKICVDCPRLCTCEmPWFTPRSrmEASTVDCND 
LGLLTFPARLPANTQILLLQTNNIAKIEYSTDFPV 
NLTGLDLSQNNLSS VTNINGKKMPQLLS VYLEEN 
KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 
LHNLLRLHLNSNRLQMINSKWFDALPNLEILMIG 
ENPIIRIKDMNFKPLINLRSLVIAGINLTEIPDNAL 
VGLENLESISFYDNRLIKVPHVALQKWNLKFLD 
LNKNPINRIRRGDFSNMLHLKELGINNMPELISID 
SLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKL 
ESLMLNSNALSALYHGTIESLPNLKEISmSNPIRC 
DCVDlWMNfMNKTNIRFMEPDSLFCVDPPEFQGQ 
NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY 
VSFHCRATA\EPQPEIYW1TPSGQKLLPNT\LTDKF
YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 
SVMIKVDGSFPQDNNGSLNIKIRDIQANSVLVSW 
KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 
KVYNLTHLNPSTEYKICroiPTIYQKNRKKCVNVT 
TKGLHPDQKEYEKNNTTTLMACLGGLLGnGVIC 
LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 
PLDvTLWEAGKEKSTSLKVKATVIGLPTNMS 


3454 


A 


1844 


244 


ERYLFATYVAPSATLDIGLQQEKKKEIYMKIQPP

FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYKKV 

ELVEETRQLDSTYFRKLQALHKETFSKKAEDTTC 

EIGTGILSLSNVSKRTEYWDNVPAEYKHFKFSDL 

LNNKLEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNQRKAKSIYIKNKYLNKKYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

HVQNRLENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKKITTIINCFINSSIPPALQl 

DIPVEQAQKIIEHRKELGPYVFREAQMTFLGVMF 

KFWPQFCEFRKNLTDENIMSVLERRQEYNKQKK 

KLAVL/QNDEKSGKDGIKQYANTSVPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERJLLKIQE 

ELEK\SCLQACNLSQILRLALQLCL 


3455 


A 


228 


3330 


APTAQAMMSFGGADALLGAPFAPLHGGGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 

aspsrfrgagaasstdsldtlsngpegcmvava 

tsrsekeqlqalndrfagyidkvrqleahnrsle 

geaaalrqqqagrsamgelyerevremrgavl 

rlgaargqlrleqehllediahvrqrlddearq

reeaeaaaralakfaqeaeaarvdlqkkaqal 

qeecgylrrfihqeevgellgqiqgsgaaqaqm 

qaetrdalkcdvtsalreiraqleghavqstlq 

seewfrvrldrlseaakvntdamrsaqeeitey 

rrqlqarttelealkstkdslerqrseledrhqa 

diasyqeaiqqldaelrntkwemaaqlkeyqdl 

lnvkmaldeeiaayrkllegeecrigfgpipfslp 

eglpkipsvsthikvkseekikvvekseketvivee 

qteetqvteevteeedkeakeeegkeeeggeeee 

aeggeeetksppaeeaaspekeakspvkeeaksp 

aeakspekeeakspaevkspekakspakeeaksp 

pe\akspekdgkqnfqaevkspekakspakeeak 

spaeakspekakspvkeeakspaeakspvkeeak 

spaevkspekaksptkee\akspekakspekaksp 

ekeeakspekakspvkaeakspekakspvkaea 

kspekakspvkeeakspekakspvkeeakspeka 

kspvkeeaktpekakspvkeeakspekakspeka 

ktldvkspeaktpakeeArspadkfpekakspvk 

eevkspekaksplkedakapekeipkkeevkspv 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 
DSKKEEAPBLKEAPKPKVEEKKEPAVEKPKESKV 
EAKKEEAEDKKKVPTPEKEAPAKVEVKEDAKPK 
EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD
TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 
KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 


3456 


A 


258 


1463 


YLSFDPGHASKSAPMNGHCFAENGPSQKSSLPPLL. 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPGVRRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCILPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKXKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP


3457 


A 


2 


4869 


FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEICENLPSDY 

MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSWSKQATSALQQEETSEKKS 

RKWIRGKAERLSLRKTLEETETYHAKLNEDHLL 

HAPEFIIKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD 

DKFDVSFGREGETMSLGCRWITPEIKHFQPEIQ 

WYKNGVPLSPSKWVQTLWSGERATLTFSHLNKE 
DEGLYmVRMGEYYEQYSAYVFVRDADAEIEG 
APAAPLDVKCLEANKDYUISWKQPAVDGGSPIL 
GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 
GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK
ARLKS/PPLSTLDWAVIVTEEEPSEGIVPGPPTDLS 
VTEATRSYWLSWKPPGQRGHEGIMYFVEKCEA 
GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 
VRCSNSAGVGEPSEATEVTVVGDKLDIPKAPGKI 
IPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 
GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 
RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC 
LESFRDSMVLGWKQPDKIGGAEITGYmiYREV 
roOVPGKWREANVKAVSEEAYKISNLKENMVY 
QFQVAAMNMAGLGAPSAVSECFKCEEWnAVP 
GPPHSLKCSEVRKDSLVLQWKPPVHSGRTTVTG 
YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 
LKEGVSYVFRVRAINQAGVGkPSDLAGPVVAET 
RPGTKEWVNVDDDGVISLNFECDKMTPKSEFS 
WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 
DDLGIYSCD VTDTDGIASS\'LIDEEELKRLLALSH 
EHKFPTVPVKSELAVEILEKGQVRRWMQAEKLS 
GNAKVNYEFNEKGIFEGPKYKMHIDRNTGIIEMF 
MEKLQDEDEGTYTFQLQDGKATNHSTWLVGD 
VFKKLQKEAEFQRQEWERKQGPHFVEYLSWEVT 
GEC^TVLLKCKVANIKkETHlVWYKDEREISVDE 
KHDFKDGICTLLITEFSKKDAGIYEVILKDDRGK 
DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 
STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

rvktgvtgeqiwlqinbptpndkgkyvmelfdg 
ktghqktvdlsgqaydeayaefqrlkqaaiaek 
nrarvlgglpdwtiqegkalnltcn vwgdppp 
evswlknekalasddhcnlkfeagrtayfting 
vstadsgkyglwknkygsetsdftvsvfipeee 

ARMAALESLKGGKKAK 


3458 


A


3963 


827 


LSRSSSDNNTNTLGRNVMSTATSPLMGAQSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

EEVMILRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTlFYYVQKLLQLSCNG>fVKSDKLRRIWEPTY 

HMYREMKDSDKEKENGKMGCWSDBHVEQYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNKNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYTVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

KKITTKILQQIEEPLALASGALPDWCEQLfSKCPF 

LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAENVMQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 

TKLFHFLGIFLAKCIQDNRLVDLPISKPFFKLMCM 

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFILDPPKPKPPAWFNGILT 

WEDFELVOTHRARFLKEDCDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSDEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL 

MFDFCMHTGlQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRFVRVLCGMSSDERKAFLQFTTGCSTLP 

PGGLANLHPRLTWRKVDATDASYPSVNTCVHY 

LKLPEYSSEEIMRERLLAATMEKGFHLN 


3459 


A 


88 


603 


SCGPRGLASLGLGFSGRCDDQNKGRS\DGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQIRRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRPTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 


3460 


A


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 
EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

llqsmgltpespivpppmspssksvstpseagsqd 

sgdgavgsrrgprklgmakitqvdfppreivtyt 

ketqtpvmaqpkedeeedddwapkppiepeeek 

tlkkdeen\dskapphelteeekqqilhseeflsff 

dhstriveralseqiniffdysgrdf/endkegeiq 

agaklslnrqff\der\wskasgwvscldwssq 

yp\ellvasynnnedaphepdgvalvwnmkyk; 

kttpeyvfhcqsavmsatfakfhpnlwggtys 

gqivlwdnrsnkrtpvqrtplsaaahthpvycv 

nwgtqnahnlisistdgkicswsldmlshpqds 

melvhkqskavavtsmsfpvgdvnnfwgsee 

gsvytacrhgskagisemfeghqgpitgihchaa 

vgaxndfshlyvtssfdwtvklwttknnkplysf: 

ednagyvydvmwspthpalfacvdgmgrldl 

wnlnndtevptasisvegnpalnrvrwthsgre 

lavgdsegqiviydvgeqiavprndewarfgrtl 

aeinanradaeeeaatripa 


3461 


A 


139 


1997 


QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKJCKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGpGAVGSRRGPIKLGMAKITQVDFPPRErVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEENVDSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTKlVERALSEQINffFDYSGRDF/ENDBLEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLWGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NWGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFWGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWrnOWKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

lAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRtL 

AEINANRADAEEEAATRIPA 


3462 


A 


2 


2643 


TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRJRL 

AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 

AESGARSVSSrVRQWNRKINHFLGDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRimi 

VMQEUVltWNLEADMERLIKKREELFLLQEALRR 

KRERLQAESPEEEKGLQELAEEIEVLAANIDYIND 

GITDCQATIVQLEETKEELDSTDfSWISSCSLAE 

ARLLLDNFLKASIDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPTi'RDRVSRTVSL 

PniGSTFPRQSR^TETSPLTRRKSYDRGQPIRSTD

VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSL\SEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

LVTGQEIAALKGHPNNWSnCYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGKLTGHIGPVMCLTVTQTASQHDL 

WTGSKDHYVKMFELGECVTGTIGPTHNFEPPH

YDGIECLAIQGDILFSGSRDNGIKKWDLDQQELIQ 

QIPNAHKDWVGALAFIPGRPMLLSACRAGVIKV

WmTDNFTPIGEKGHDSPINAICTNAKHIFTASSG 

CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 


3463 


A 


198 


3146 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFA 
GVYRAESIHTGLEVAIKMIDKKAMYKAGMVQR
VQNEVKIHGQLKHPSILELYNYFEDSNYVYLVLE 
MCHNGEMNRYLKNRVKPFSENEARHFMHQnTG
MLYLHSHGmHRDLTLSNLLLTRmiNIKIADFGL 
ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 
SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV
LADYEMPTFLSIEAKDLIHQLLREINPADRLSLSSV 
LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 
TASSSTSISGSLFDKJIRLLIGQPLPNKMTVFPKNK 
SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 
AEERPHSRYLRRAYSSDRSGTSNSQS.QAKTYTM 
ERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 
FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 
PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 
DNAHSVKQQNTMKYMTALHSKPEnQQECVFGS 
DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR 
LKPIRQKTKKAWSILDSEEVCVELVKEYASQEY 
VKEVLQISSDGNTITIYYPNGG\RGFPLA\DRPPSP 
T\DNISR\YSF\DNLPEKYWRKYQYASRFVQLVRS 
KSPKITYFTRYAKCIL^^ENSPGADFEVWFYDGV 
KIHKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIK 
MYMDHANEGHRICLALESnSEEERKTRSAPFFPn 
IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 
VMHSAASPTQAPILKTSMVTNEGLGLTTTASGTD 
ISSNSLKDCLPBCSAQLLKSVFVKNVGWATQVLTS 
GAVWVQFNDGSQLVVQAGVSSISYTSPNGQ\TTR 
\YGENEKLPDYIKQKLQCLSSILLMFSNPTPNFH 


3464 


A 


14 


348 


AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKNNPW 
V\EKGKCTIL 


3465 


A 


5537 


405 


VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELKLNWLLAKALWVLARRCYTLQEE 

MKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGA WPEAGGQSA

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWICEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGWDRGLCQMVRVGSKTEVATEILDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLWSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQICLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRKEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

iEERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY 

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDKR 

WRSPAQGHLPSVAHLEDFQGLTIPQGSSLVLQGN 

SKRLPL>^TTKIMIAALDYDPGDGQMGGQGKGRL 

ALRAGDWMVY\GPMDDQGFYYGELGGHRG\L 

VPANLRKMSSQGH 


3466 


A 


1 


nil . 


MSKPPDLLLRLLRGAPRQRVCTLFIIGFKFTFFVSI 

MIYWHWGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNIFFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASRIAmWKFGGIYLDTDFIVLKNLRNLT 

NVLGTQSRYVLNGAFLAFERRHEFMALCMRDFV 

DHYNGWIWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRG VTTLPPEAFYPff WQD WKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 


3467


A 


1 


2175


MAKVBLKQSKQCKNLLTCKVAQVCPVCGCLHC 
YFWWLSGLESRRPSSPLIDIKPIEFGVLSAKKEPIQ 
PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 
TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA
RDLPPPISHDGSRQDMAHSNPYVKICLLPDQKNS 
KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 
TWDFDKFSRHCVIGKVSVPLCEVDLVKGGHW
WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 
LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 
NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 
LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLP\SL 
QRGEGEAMLS\ALTLFSRSPLEQNIIQPLVLSLLHL 
CGSVVNMPPGNSQPRGDFLYHSICTWVQDNYAQ 
PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 
WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 
CRVFRRQFGMDYVDILQIHRWDWTPIEETLEAL 
NDWKAGKARYIGASSMHASQFAQALELQKQH 
GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV
AVIPWSPLARGRLTRPWGETTARLVSDEVGBCNL 
YKESDENDAQIAERLTGVSEELGATRAQVALAW 
LLSKPGIAAPnGTSREEQLDELLNAVDITLKPEQI 
AELETPYKPHPWGFK 


3468 


A 


147 


3209 


ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 

STDPPVMVnGGQENPNNSSASSEPRPEGHNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 

FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK 

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAFNKFVLMKAVEPKNKADENTPPGSE 

OSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 

FK5TGSFPLPLCARALG\ASPSETSKLQQLVEKID 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV 

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT 

NAVTLQQHVRMHLGGQEPNGGTALPEGGGAAQ 

ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 

EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 

EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS 

SLPPPPPPDSLDQPQPMEQGSSGVLGGKEEGGKP 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QK'raPKEG 

PLRTCVFCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 


3469 


A 


3 


5664


NLRPLSFALFLGDPNMANLEESFPRGGTRICIHKP 
EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 
TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG
CVKEV>fELELVISLPNGLQGFVQVTEICDAYTKK 
LNEQVTQEQPLKDLLHLPELFSPGMLVRCWSSL 
GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 
LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 
IRQKNKGAKLKVGQYLNCrVEKVKGNGGWSLS 
VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 
KVTPFGLTLNFLTFFTGVVDFMHLDPKKAGTYFS 
NQAVRACILCVHPRTRWHLSLRPIFLQPGRPLTR 
LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 
AYARLSHLSDSKNVFNPEAFKPGNTHKCRIIDYS 
QMDELALLSLRTSHEAQYLRYHDIEPGAVVKGT 
VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 
NPEKKYHIGDEVKCRVLLCDPEAKKLMMTLKKT 
LIESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV
KFYNNVQGLVPKHELSTEYIPDPERVFYTGQW 
KVWLNCEPSKERMLLSFKLSSDPEPKKEPAGHS
QKKGKAJNIGQLVDVICVLEKTKDGLEVAVLPHN 
IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 
CLSQSEGRVLLCRKPALVSTVEGGQDPPCNFSEIH 
PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 
DBCFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 
SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM
SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 
VFSGGPVPDLVLKASRYHRAGQEVESGQKKKVV 
ILNVDLLKLEVHVSLHQ\DLV\NRKARKLRKGSE 
HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 
TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 
GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 
KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 
HASHILDDVPEGTSPTTKLKVGKTVTARVIGGRD 
MKTFKYLPISHPRFVRTIPELSYRPSELEDGHTAL 
NTHSVSPMEKIKQYQAGQTVTCFLKKYNWKK 
WLEVEIAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 
GQALRATWGPDSSKTFLCLSLTGPHKLEEGEVA 
MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 
SETPLEDFVPQKWRCYILSTADNVLTLSLRSSRT 
NPETKSKVEDPEINSIQDIKEGQLLRGYVGSIQPH 
GVFFRLGPSWGLARYSHVSQHSPSKKALYNKH 
LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 
VLS ASLEGQLTKQEERKTEAEERDQKGEKKNQK 
RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 
GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 
YYREGKEEAEETNVLPKEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI 

KKSKKERELEKQKAEICELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSILWLQYMAFHLQATEIEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

EKFQEAGELYNRMLKRFRQEKAVWIKYGAFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVYID 

MTDCHGSQKDVRDIFERVIHLSLAPKRMKFFFKR 

YLDYEKQHGTEKDVQA VKAKALEYVEAKSS VL 

ED


3470 


A


2334 


1226


TAAAPVAPGTMDDATVLRKKGYJVGINLGKGSY 

AKVKSAYSERLKFNVAVKIIARKKTPTDFVERFL 

PREMDILATVNHGSIKTYEIFETSDGRIYIIMELG 

VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLRDSNGRIILSKTFCGSAAYAAPEVLQSIPYQPK

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIQK 

EHRVDFPRSKNLTCECKDLIYRMLQ\PDVS\KRLH 

IDEILSHSWLQPPKPK\ATSSASFKREGEGKYRAE 

CKLDTKTGLRPDHRPDHiaGAKTQHRLLVVPEN 

ENRMEDRLAETSRAKDHHISGAEVGKAST 


3471 


A 


537


148 


TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 


3472 


A


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHWFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNWFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILiDPH 

WLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASXAAEDNYGYDACAVLCLPCVPN 

ELVIATESGMLYHCWLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHELCTBCPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSELQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQ YILKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SU>KPTIILSAYQRkCIQSIIiaSEGEHDDElEMVKQIN 

DIRNHVNF 


3473 


A 


1 


2272 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVWGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGDCGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH 

VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN 

ILVMTESG3VILYHCVVLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS 

RATQVFREQYELKQDLAKEEIQRRVKLLCDQKK 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 


3474 


A 


4344 


2550 


DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 
KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 
EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 
HLEIALLEKHFLQEELRKLREETNAEMLRQELDR 
ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 
AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLS ARRLTENMRRLKRGAKPV
TNFVKNLS ALSD WS VYTSAIAFTVYMNAVWH 
GWAIPLFLFLAILRLSLhri'LIARGWRIQWSrVPEV 
SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 
MADILEKIKNLFMWVQPEITQKLYVALWAAFLA

SCFFPYRLVGLAVGLYAGIKFFLIDFIFKRCPRLR
AKYDTPYIIWRSLPTDPQLKERSSAAVSRRLQTTS 
SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 
LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 
GVLYVTAENYLCFESSKSGSSKKNKVIKLVDITDI 
QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 
HRDEAFETILSQYIKTTSAAASGGDS 


3475 


A 


2 


1126 


TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

ffiLMESRKDITNQEELWKMKPRRNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPnOAAIIASLTFLYTLLREVIHPLA 

TSHQQYFYKJPILVINKVLPMVSITLLALVYLPGV 

lAAIVQLHNGTKYKKFPHWLDKWMLTRKQFGL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALMEHDVWRMEIYVSLGIVGLAILAL 

LAVTSIPSVSDSLTWREFimQSKLGrVSLLLGTIH 

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPIVVLI 

FKSILFLPCLRKIOLKIRHGWEDVTKINKTEICSQL 


3476 


A 


143 


3191 


AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL 

DLILCNKTAYQEVFKPENISLRNKLRELCVKLMF 

LHPVDYGRKAEELLWRKVYYEVIQLIKTNKKHI 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 

KKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQ 

PICSSSVDSELTSLCQSVLEDFNLCLFYLPSSPNLS 

LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL 

ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 

BEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 

PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 

DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 

AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 

PPRGR5EAPDSLNGPLGPSEASIASNLQAMSTQM 

FQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCV 

NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 

EJaQVLMAEGLLPAVKVFLDWLRTNPDLirVCA 

QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 

QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

PJRFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 

GSILQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

QEEARRNRLMRDMAQLRLQLEVSQLEGSLQQPK 

AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI 

PRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 

RYIRCQJOEVGKSFERHKLKRQDADAWTLYKILD 

SCKQLTUAQGAGEEDPSGMVTIITGLPLDNPSVL 

SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 


3477 


A 


1 


3902 


MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 
KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 
GGDAVATTGEIHEEKAWKTRALEVGQPAQRDIR 
RGELWGKEHGADQAIQETLEDLSSLERTL VVSES 
SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 
ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 
LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 
HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 
LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 
ETKHAELIWKELDREIHSFFDLVLTAYDNGNPP 
KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA 
APGTLLUCLTATDPDQGPNGEVEFFLSKHMPPEW 
LDTFSIDAKTGQVILRRPLDYEKNPAYEVDVQAR 
DLGPNPIPAHCKVLIKVLDVNDNIPSIHVTWASQP 
SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 
SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 
YTLTLLAQDQGLQFLSAKKQLSIQISDINDNAPVF 
EKSRYEVSTRENNLPSLHLITIKAHDADLGINGK 
VSYRIQDSPVAHLVAIDSNTGE VTAQRSLNYEEM
AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 
APEVVQPVLSDGKASLSVLVNASTGHLLVPIETP 
NGLGPAGTDTPPLATHSSRPFLLTnVARDADSG 
ANGEPLYSIRSGNEAHLFELNPHTGQLFVNVTNA 
SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 
VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 
LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 
PQKfflQKADIHLVPVLRGQAGEPCEVGQSHKDV 
DKEAMMEAGWDPCLQAPFHLTPTLYRTLRNQG 
NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 
NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 
EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 
LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 
SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD 
GPSARAGGQTDPEQEEGPLDPEEDLSVKQLLEEE 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGkAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RJRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 


3478 


A 


13 


1620 


TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLIEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSBLPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIFFADALDLFRGRKVYLEDGFAYVPLKDrVAIIL 

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC

MRQLHKALRENHHLRHGGRMQYGLFLKGIGLT 

LEQALQFWKQEFIKGKMDPDKFDKGYSYNIRHS 

FGKEGKRTDYTPFSCLKHLSNPPSQGDYHGCPFR 

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRI 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 


3479 


A 


698 


138 


RPELELWRLRSRSWRPLGVPRRCHRKNWKEPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\1VEMLVQEQLLAILP 

EAARARR[RRRTDVRITG 


3480 


A 


117 


2226


RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSISKPNMISLLEQ 

GKEPWMVERKMSQGHGADWESWWEIEELSPK 

WFIDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVUEYKRLHAEKESLIGNECEEFNQSTYLSK

DIGIPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRIHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

KCGKFFRTDSQLNRHHRIHTGERPFECSKCGKAF 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 


3481 


A 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ 
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDVVLLNNNPGNFDVALDI 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI 
HEGTFBENGQWENIHKPSRLIQPPGDPRGGREGQ 

RQEVIFYLIIRRKPLFYLVNVIAPCILITLLAIFVFY 

LPPDAGEKMGLSEFALLTLTVFLLLLADKVPETSL 

SVPimYLMFIMVLVTFSVILSVVVLNLHHRSPH 

THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF 

QPELSAPDLRRFIDGPNRAVALLPELREVVSSISYI 

ARQLQEQEDHD ALKEDWQFVAMWDRLFLWTF 

nFTSVGTLWIFLDATYHLPPPDPFP 


3482 


A 


1273 


172 


ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPA WPEPQGPMGPAG V AA 

RPGRFFGVYLLYCLNPRYRVRWYVGFTVNTARR 

VQQHNGGRKKGGA\GRTSGRGPWEMVLVVHGF 

PSS VAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR 

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 


3483


A


230 


3686 


WRP WPCEDTS WNLQVAARTLRVSS AQCGLVPT 

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD 

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT 

AQMLQSFGTELAETELPNDVQS'nSSVLCAHTEK 

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA 

LSLTASSFIGNKHYAVDSmPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAIYKEYESILNQDLMEHVRKVFQKQASMEEV 

FHRRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES 

RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 

ELLCVLEGYAAEMDNPLMAHLLSTGLHNKKDV 

LFGNMEEIYHFHNRIFLRELENYTDCPELVGRCF 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRITKYQLLLKEM

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 

AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT 

KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 

GYEKAPSYSYKQSLNMAAVGITENVKGDAKKFE 

IVmiAREEVnVQAPTPEIKAAWVNEIRKVLTSQ 

LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 

NUCKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 

SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 

GPKKLVPGKYTWADHEKGGPDALRVRSGDVV 

ELVQEGDEGLW 


3484 


A 


208 


6103 


VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

KVKVNKDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLKERYYSGLIYTYSGLFCWINPYKNLPIYS 

EEIVEMYKGKKRHEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVJQYLAYVASSH 

KSKKDQGELERQLLQANPELEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNTVFKKERKnTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAffiALAKATYERMFRWLVLRINK 

ALDKTKRQGASHGILDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFG 

LDLQPCIDLIEKPAGPPGDLALLDBECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCnHY 

AGKVDYKADEWLMKNMDPLNDNIATLLHQSSD 

KFVSELWKDVDRnGLDQVAGMSETALPGAFKT 

RKGMFRtVGQLYKEQLAKLMATLRNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRWFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE 

ERDLKITDVDGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

MEtLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEAR.VEEEEERCQKDLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQIILEDQNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLICNfKHEAMITDLEERLRR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKMQLAKXEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCER\ASRNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNIL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNS\FREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKRKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ

S ACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

LEDERKQRSMAVAARKKLEMDLKDLEAHIDSA 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 

EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 

AKRQAQQERDELADEIANSSGKGALALEEKRRL 

EARIAQLEEELEEEQGNTELINDRLKKANLQIDQl 

NTDLNLERSHAQKNENARQQLERQNKELKVKL 

QEMEGTVKSKYKASITALEAKIAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERKNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RKLQRELEDATETADAMNREVSSLBCNKLRRGDL 

PFWPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 


3485 


A 


2 


1782 


CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR 

EDRGGESVGMGQESELSQVHWWEAEPVEKTPGR 

DSEATIMSLRVKmLPTLLGAV VRPGCRELLCLLM 

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLIKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLK'RVK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGERLIvrVHCDSKTGNANTDFIWVGPDNRLLEPD 

KEMENFYVFHNGSLVIESPRFEDAGVYSCIAMNK 

QRLLMETVDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRVVFL 

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 


3486 


A 


357 


1173 


GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR 

lYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA 

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT 

YTK\LVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAIIKKYRPNACEEALINGSGVKFKEFIHYLLDS 

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA 

QVVRQYLKDLTRTERQLIYDFYYLDYLMFNYTT 

PFL


3487 


A 


2 


3281 


CDKSGAVPFSTTRSPRRPSPRSAGPSLS S VSPRSQ 
LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM
AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 
PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 
LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 
FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 
PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 
LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 
FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 
WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 
GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 
EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 
HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA
RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFjEAI 
LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 
PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 
QRGEEEEAEARAKLAPGREPPSPCHSEDSLGLGA 
APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 
NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 
GKNNDFSKLVAGEYLBCFFVFTGMTLDQALRVFL 
KELALMGETQERERVLAHFSQRYFQCNPEALSSE 
DGAHTLTCALMLLNTDLHGHNIGKRMTCGDnG 
NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE 

EELRRFLSELADPNPKVIKRJSGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMILYLQKEEYKPGKALSETELKN 

AISIHHALATRASVNYSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINWAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQ\TITHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 


3488 


A 


441 


1968 


GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGtVFQFLUSASSSTTCCESTLRSVSY 

VASGSTPAPALCCAP\YDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR 

KNPYPTKGEKIMLAIITKMTLTQVSTWFANARRR

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA

EDEEWATAGDRLTEFRI<:GAQSLPGPCAAAREG

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI

PKAFGNPKFALQGLPLNCAPCPRRSEPVVQCQYP

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH

SSLP 


3489 


A 


718 


2073 


IAAYHKALSYRGHVHANNRGTNNVHFTPPPSPS

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK

SNLNTSGRTTSSTDGPKFPGDKSSTTQNNNQQKK

GJQVLPDGRVTNIPQGMVTDQFGMIGLLTFIKAA

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TAIKLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 


3490 


A 


2 


2833 


FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSY>}PTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVHCLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 
RFHCKLCECSFNDLNAKDLHVRGRRHRLQYRKK 

VNPDLPIATEPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATI 

YPTEQELLAVQRAVSHAERALKJLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL 

LLRGDRNVRLALLCSEKPTHSLLRRIAQQLPRQL 

QMVTEDEYEVSSDPEANTVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARASGLQPCVrVIRVLRDLCRRV 

PTVWGALPAWAMELLVEI'CAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 


3491 


A 


2


1321 


FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG 

TGYTKLGYAGNTEPQFnPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEAEDKPTYATKWPIRHGII

EDWDLMERFMEQWFKYLRAEPEDHYFLMTEP 

PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG 

YVIGSCIKHIPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWIK 

QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN 

PDFMESISDWDEVIQNCPEDVRRPLYKNWLSG 

GSTMFRDFGRRLQRDLBCilVVDARLRLSEELSGG\ 

RIKPKPVEVQWTHHMQRYAV\WFGG\SMLASTP 

EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 


3492 


A 


3 



2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEWXLYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FY\a,GNHRESN>JMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL


3493


A 


3 


2024 


PNGVALLHLPGAAVIPNTNYMFQDALGGRSRGS 

REESPAPSRAPASASLWRRLWVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHHTKNSEQARSHLEKAWLISQQIPQFEDVKFE

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY
WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARWGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVV\LYS 

LLERINPDHSFPVSSHCLRAAAFYVRGLFSFFQGR

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

Fm.GNHRESNNMVVPAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 


3494 


A 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN 

MPYDLCEDEPTLDVIETCMAYQGITQEKINEMRV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGANU.NARTSMDE 

MProLCEEEEFKVLLLELKVHKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNL\YR 

KEYE/GEEAILWQRSA\AEDQRTSTYNGDIREm 

TDQENKDPNPRLEK\PVLLSEFPTKIPRGELDMPV 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 


3495 


A 


327 


1078 


APNIAD'ITPNGPQGAGAVQFMMTNKLDTAMWL 

SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 

LTSALRLHQRLPHFQLSRAFLAQALLEDSCHYLL 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVL\ 

DARG\SNSLPLLR\SVLDKLSANQQN[LKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLF>ffiLRIVVEHIIMKPACPLFVRRLCLQS 

lAnSRLAPTVP 


3496 


A


3


2867 


SSRTREMEEKEELRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SHHGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNVVIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

gepprgqlqpsrptrargtcsvedpllvcqkepg 

kprmvksvgsvgdspreprrtvsesviavkasfp 

ssalpprtgvalgrklgshsvascapqllgdrrv 

daghtdqpvpsgsvggparpasgprqareaslv 

vtcrtnkfrknnykwvaassksprvarralspr 

vaaenvckasagmankvekpqliadpepkprkp 

atssbcpgsapskykwkasspsasssssfrwqsea 

gskdhasqlspvlsrspsgd\rpalahsglkplsg 

etplsaykvktrtkurrrgstslpgdkksgtspa 

atakshlslrrrqalrgksspvlkktpnkglvq 
VTKHRLCRLPPSRAHLPTKEASSLHAVRTAPTSK 

VECTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS 

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKG:\'R 

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNGPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHE\APSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

BCPLHIBCPRL 


3497 


A 


1586 


141 


ATARDLGCARRIDRVVMESTPSRGLNRVHLQCR 

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL 

AKNWVMRMLFLEQPU'QAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLDLNPFRQN 

LRIALLGGGKAWSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM 

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRJIYYP 

T/RALArNLSSGVSGAGGTVHQPGnVWETNYRL 

YAYTESELQIALIALFSEMLYPFP\NMVV\ARVTR\ 

ESVQQAIASGITAQQIEHFLRTRAHPVMLKQTPVL 

PPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDF 

ELL\LAHAPICLGVLVFE/NTPAKRLMWTPAGHS 

DVKRPWKRQKHSS 


3498 


A 


790 


190 


RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 

SLFLSNGVAANDKLLLSSNRITAIVNASVGSGQRI 

LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTVS 

MRQGRTLLNCMAG\MSRSASLCLAYLMKYHSM 

SVLLDAHTWA/TKSRRPDRPNNGFWEQLINYEFK 

LFNNNTVRMINSPVGNIPDIYEKDLRMMISM 


3499 


A 


31 


1586 


TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL 

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTWQRLMERIQLPWD 

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD 

lEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMWCGDIAVYPSGNARPTGGAGAVAlvlLIGPK 

APLALERGLRGTHME3fVYDFYKPNLASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMIFHTPFCKMVQKSL/Ua.MFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPL\DKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNSLFPGT 

WYLERVDEQHRRKYARRPV 


3500 


A 


185 


2692 


MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LLAAPGSITHQDLTEEAALNVH,QLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL 
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 
AKLVGALRETVVAARALDHTLARQRLGAALHA

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDrTPASSLSFVLDTTGSMGEEINAAKIQARHL 

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG 

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPW 

VPGQPLVFSVDGLLQKITVRfflGDISSFWIKNPAG 

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN 

PGDPQPHFSHVILRGVPEGAELGQVPLEPVGPPE 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR 

AAPQPSTWPVLLELSGPSGFLAPGSKVPLSLRIA 

SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 

WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 

PPTHAFLRLLVSAPAPQDRH 


3501 


A 


1245 


5815 


RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ 

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFIDQVEAKWVEVKSKRRDMTVFSGLFVGG 

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 

GGVCLNGGVCSWDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG 

KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GiDPKMKIHGVVAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDAKHPQMDCVDFFAIEMLDGHLYLLLDMGSGT 

KIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENICAGLVF 

PTEVWTALLNYGYVGCIRDLnDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPWMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRVKLTVNLDCIRINCNSS 

KGPETLFAGYNLNDNEWHTVRWRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGnTERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFKNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIWELVK

GYLHYVFDLGNGANLDCGSSNKPLNDNQWHNV 

MISRDTSNLHTVKIDTKITTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTT\CQ 
EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND
PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 
STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 
FNVGTDDIAJEESNAUNDGKYHWRFTRSGGNA 
TLQVDSWPVIERYPAGRQLTIFNSQATinGGKEQ 
GQPFQGQLSGLYYNGLKVLNMAAENDANIATVG 
NVRLVGEVPSSMTTESTATAMQSEMSTSIMETTT 
TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 
DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE
SSSTtGMVVGIVAAAALCILILLYAMYKYRNRDE 
GSYHVDESRNYISNSAQSNGAWKEKQPSSAKSS 
NK>JKKNKDKEYYV 


3502 


A 


394 


72


KPAHLPFTVUMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 


3503 


A 


43 


3358 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS 

SLPPPPSRALAPTRAPDTALTIMEVAEyESPLNPS 

CKIMTFRPSMEEFREFNKYLAYMESKGAHRAGL 

AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 

SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 

LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV 

DEWNIARLNTVLDWEEECGISIEGVNTPYLYFG 

MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 

PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 

PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 

HGFNCAEStNFATVRWlDYGKVAKLCTCRKDM 

VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 

TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 

RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 

KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 

KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 

ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 

ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 

NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 

VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 

WQTKPPNFAAEQEYNATVARMIGPHCAICTLLMP 

YHKPDSSNEENDARWETKLDEWTSEGKTKPLIP 

EMCFIYSEENIEYSPPNAFLEEDGTSLLliSCAKCC 

VRVHASCYGffSHEICDGWLCARCKRNAWTAEC 

a.CNLRGGALKQTKNNKWAHVMCAVAVPEVR 

FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 

GACIQCSYGRCPASFHVTCAHAAGVLXMEPDDW 

PYVVNITCFRHKVNPNVKSKACEKVISVGQTVIT 

KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRD 

TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 

GAKYFGSNIAHMYQVEFEDGSQIAMKREDIYTL 

DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 

QAQQETYLGFWINSKKSQCNIFLSGTY 


3504 


A 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHR 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER 

VPGIFISRLVRGGLAESTGLLAVSDEILEVNGIEV 



WO 01/57190

PCT/US01/04098



AGKTLNQVTDMMVANSHNXLrVTVKPANQRNN 
WRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGmSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL 


3505


A 


3 


2898 


SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELEDNAYDPDV 

NAKQIWIDKTVINDHICLTFTDNGNGMTSDKLH 

BCMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKbAIVFTKNGESMSVGLLSQTYL\EVIKAEHV 

VVPIVAFNKHRQMINLAESKASLAAILEHSLFSTE 

QKLLAELDAnGKKGTRIUWNLRSYKNATEFDFE 

KIJKYDIRIPEDLDEITGKKGYiaCQERMDQIAPES 

DYSLRAYCSILYLKPRMQnLRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCKNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANNMGVGWGn 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YW^nEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNP\DPQFR 

NCEyPEEPEDEDLVHPTYEKTYKKTNKEKFRIRQ 

PEMIPRINAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATULLNNHQVPPQSEPESNSLKRRLS 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVU 

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LWKKEETVEDEIDVKNDAVILPSCVEAEAKIHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMFTDQIKVLQQRILEMNDKYVKKETCH 

QSTETDAVFLLESINGKSESPDHMVSQYQQALEE 

lERLKKQCSALQHVKAECSQCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATbVSTSSNIEE 

SVNHMDGESLKLRSLRVKTVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQV\^QMSEISST 


3506 


A 


2 


2120 


rppeaggryraggrrqaakpsrpplpsrrrlpqg 

grtrramdrpaaaaaagceggggpnpgpaggr 

rppraaggatagsrqpsvetldsptgshvewck 

qliaatissqisgsvtsenvsrdykalrdgnkla 

qmeeapufpgesikaivkdvmyicpfmgavsgtl 

tvtdfklyfknverdphfildvplgvisrvekiga 

qshgdnscgieivckdmrnlrlayk\qeeqsklg 

ifenlnkhaFplsngqalfafsykekfpingwkv 

ydpvseykrqglpneswkiskinsnyefcdtypa 

nwptsvkdddlskvavflakgrvpvlswihpe 

sqatitrcsqplvgpndkrckedekylqtimdan 

aqshkliifdarqnsvadtnktkgggyesesayp 

naelwleihnihvmreslrklkeivypsidearw- 

lsnvdgthwleyirmllagavriadkiesgktsv 

whcsdgwdrtaqltslamlmldsyyrtikgfe 

tlvekewisfghrfalrvghgndnhadadrspif 

lqfvdcvwqmtrqfpsafefnelflitildhlys 

clfgtflcnceqqrfkedvytktislwsyinsql 

defsnpffvnyenhvlypvaslshlelwvnyyv 

rwnprmrpqmpihqnlkellavraelqkrveg 

lqrevatravssssergsspshfatsvhtlv 


3507 


A 


1 


2169 


GSSIKIRLTVLCAK>njyCKDFFRLPDPF\AKIVVD 
GSGQCHSTD.TVKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 

TGYQRLDLCKLNPSDTDAVRGQIWSLQTRDRIG 

TGGSWDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRIEVS 

REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRJMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 

G\R}WPVTEENKKEYVRLYVNWRFMRGIEAQFL 

ALQKGFNELIPQHLLKPFDQKELELnGGLDKDDL 

NDWKS>^TRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGVAAGPR 

LFnHLroANTDNLRKAHTCFNRIDIPPYESYEKL 

YEKLLTAVEETCGFAVE 


3508


A 


3 


6388 


ILYINPADLGWNPPVSSWIEKREIQTERAMLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQAGLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGR>ri'GPPGNKiaiYFIDDMNMPEVD 

aygtvqphtiirqhldyghwydrsklslkeitnv 

qyvscmnptagsftinprlqrhfsvfvlsfpgad 

alssiysnltqhlklgnfpaslqksipplidlalaf 

hqkiattflptgikfhyifnlrdfanifqgilfssv 

ecvkstwdlirlylhesnrvyrdkmVeekdfdl 

fdkiqtevlkkttddiedpveqtqspnlychfan 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENnSNVRNEVKSQ 

GLVDNRENCWKFFIDRJRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGEEPTVKQSISBCFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVIMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 



LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGEKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAWLIENLEESIDFVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAWSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKJNEAREHYRPAAARASLLYFIMNDLSBOHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLM>3R 

EVNAVELDFLLRSPVQtGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYTVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GnTEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3509 


A 


3 


6388 


ELYINPADLGWNPPVSSWEEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE 

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTERDY 

YIDPETKKFEPWSICLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKKLIYFIDDNENMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSHLTQHLKLGNFPASLQKSIPPLIDLALAF 

HQIOATTFLPTGIKJFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQIEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDL\'LFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENnSNVKNEVKSQ 

GLVDNRENCWKFFIDRJRRQLKVTLCFSPVGNKL 

RVRSRKFPATVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QK>JEDADKLIQVVGVETDKVSREKAMADEEEQ 
KVAVMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALira.NKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRWKDRS\\TKAAkVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEWCDVEPKRQALNKATA 

DLTAAQEKLAABCAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVnSLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG 

YLQIIEQALEAGAWLffiNLEESIDPVLGPLLGRE 

VnCKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAWSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT

EVKINEAREHYRPAAARASLLYFIMNDLSiaHPM 

YQFSLKAFSrVFQKAVERAAPDESLRERVANLED 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDffiGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYWGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEWAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRICFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPYIWAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

Gm:EAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 


3510 


A 


390 


3330 


AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL 

NESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVF

QSSTSQEQVYM)CAKKIVKDVLEGYNGTIFAYG 

QTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIY 

SMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS

NRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 

LSGBCLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKS'riLFGQRAKTr 

KNWCVNVELTAEQWKKKYEKEKEKNKILRhm 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN 



QKSATlJ^SmAELQKlKEMTNHQKKIlAAEMMA 

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTMVKRCKQLESTQTESNKKME 

ENEKELAACQLRISQHEAKIKSLTEYLQNVEQKK 

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKAKLITDLQDQNQKMMLEQERLRVEHEKLKA 

TDQEKSRKLHELTVMQDIUREQARQDLKGLEETV 

AKELQTLHNLRKLrVQDLATRVKKSAEIDS\DDT 

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA 

DLRCELPKLEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 

VRGGGGKQV 


3511 


A 


1 


1757 


MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG 

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKIODHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNyYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPEGEAAGKMTTEE 

QQQRHWVAPGGPYSAETPGVPSPIAALKNVAEA 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATILAGDIKVKKERDP 


3512 


A


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKERLSS 

lEKKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCWIDGMPPGWFBCAPGYLEISSMRRIL 

EAAEFIKFTVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EYNLRRHYQTNHSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPtQKSPVQ 

PVEDLAGNLWEKLREKIRSFVAYSIAIDEITDINN 

TTQLAIFIRGVDEMFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLBCNFCINWSKLVSVASTGTPPMVDA 

NNGLVTKLRSRYATFCKGAELKSICCIIHPESLCA 

Q\KLia»mHVMDVVVKSVNWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQIVTQMYDLIRAFLAKLCLWET 

HLTRl-n^AHFPTLKLVSRNESDGLNYIPKIAELK 

TEFQKRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVBDLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

K. I tUlnC AKlLbMx" u STYICEQLFSIMKLSKTK YC 

SQLKDSQWDSVLHIAT 


3513 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 
LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

lAEGTSISEMWQNDLQPLLffiRVPGSPGSYAARQ 

HIMQRIQRLQADWVLEE)TFLSQTPYGYRSFSNn 

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL

HL


3514 


A 


1836 


513 


FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

lAEGTSISEMWQNDLQPLLBERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNH 

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 


3515 


A 


114 


754 


LCRDLTTTMSSKRTKIXTXKRPQRATSNVFAMF 

DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS

LGKNPTDEYLDAMMNEAPGPINFTMFLTMFGEK 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TT\MGDRRTDE\EVDELYREAP1\DKKGGIFNY1\E 

FTRHLETGGPKDKDDRK^]TQIPSP^fVPWLATFG 

VFLEIFLLHGP 


3516 


A 


1 


5169


MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKNYYFRGAAGDHGSCfTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTABUDPSEAFQALQAALPRRGGRLGFPKRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLVVSLREENPALRKDALQEL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVnSLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGKFNPSSTPHSSLVGnSLLYNLLDDSNFKVV 

HGTLiEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RLTEQGFVEYAVLMPSSAGGRSNHLAHGADTD 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

ctrrvlsagkgicnklpweneqpgimgenqtsts 
kdieqfstydfipsabclklsqgmpvnddlcfsrjk: 
rvsrmfqnsrdfnpdclplcaagttgthqtnls 
gkcaqlgfsqicgktgsvgsdlqflgttsshqek 
VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

BLPSYPVSSPRTSPKHTSPLUSPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRE.SLSAQKSS\DPTGR\NHG 

\ENSQEKPPVVQLTPAL\VRSPSSRRGLNGTKPVPPI 

P\RGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LProLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRIMSDIFPTFG 

SKPCPTRLSSAKKKISHUEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSWWGKGVFGSLSSAPATCSQ 

SVISSVENGDTFSnCQSiEPPSGIYGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 

KDCEKKEKNSWERMRHTGTEKMASESETPTGAI 

SQYKERMPSVIHSPEIMDLSELRPFSKPEIALTEA 

LRLLADEDWEKKIEGLNFIRCLAAFHSEILNTKL 

HETNFAWQEVKNLRSGVSRAAWCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVN>fVTPARAWSLINGGQRYYGRKMLFF 

MMCHPNFEKMLEKYVPSKDLPYIKDSVRNLQQK 

GLGEffLDTPSAKGRRSHTGSVGNTRSSSVSRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 

GLLNAKDFRDRINGIKQLLSDTENNQDLWGNIV 

KIFDAFKSRLHDSNSKVNLVALETMHKMIPLLRD 

HLSPIINMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEia 

ADIVTELYQRKPHATEQKVLWLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 


3517 


A 


1449 


252 


QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 
QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 
VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 
IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK 
AKFQNWMKNSLKVHNESILDQVWNIFSEASNSE 
PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 
VEQQGEVKKNKRERKEERQKKRKREKKELKLE 
NHQENSRNQKPKKRKKGQEADLEAGGEEVPEA 
NGSAGKRSKKKXQRKDSASEEEARVGAGKRKR 
RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 
GmWKGTIKAILKQAPD>nEITIKKLRKKVLAQY 
YWTDEHmSEEELLVIFNKKISKNPTFKLLKDK 
A^VK . 


3518 


A 


3 


635 


APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

LAALMASPSKAVIVPGNGGGDVTTHGWYGWVK 

KELEKIPGFQCLAKNMPDPITARESrWLPFMETEL 

HCDEKTinGHSSGAIAAMRYAETHRVYAIVLVSA 

YTSDLGDENERASGYFTRPWQWEKTKANCPYTV 

QFGSTDDPFLPWKEQQEVAD\SWKPNCTNSLTV 

ATFRTQSFMN 


3519 


A


81 


2277 


VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

YVKQLSQQSDGDRDLQEHRQRIQALAEETAQNL 

KJ<J>1 V Y y N YRQFIE 1 AREIS YLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM 










.YRYNALYSLDG1J^V\TWKD>IPPMKDMFKLLMF 

PENRIFQAENAKIKREWLEVLEDTICRALSEKRRR 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE 

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLRIEGATLLyiHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFWW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

KEHCQQLGDIGLDLTFIIHALLVKDIQGALHSYK 

EHIEATBCHRNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCVA^LSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEIILVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVLVPVVEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV


3520 


A 


1706 


540 


FVAHLAWPWRADGDMEDGVLNEGFLVKRGmV 
HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRI 
LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 
CSREE/RRDAWAFE\ITGAIHAGQARGKVQQLHS 
LRNSFKLPPfflSLHRTVDKMHDSNTGIRSSPNMEQ 
GSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS 
MLMEENFLRPVGVRSMGAIRSGDLAEQFLDDST 
ALYTFAESYKKKISPKEEISLSTVELSGTWKQGY
LAKQGHmOSnVKVRRFVLRKDPAFLHYYDPSK 
EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 
GNLFKV1TK\DDTHYYIQA\SSKAE\RAE\W1GSLS 
KSLNMNKDPEGTPDSLPSLPR 


3521 


A 


3 


3063 


hasvslslgcprpcadtpgpqpqpmdlrvgqrpp 

vepppeptllalqrpqrlhhhlflaglqqqrsve 

pmrvkmeLpacgatlslvpslpafsiprhqsqsst 

pcpflgcrpcpqlsmdtpmpelqeapqeqelrql 

lhkdksigisavassvvkqklaevilkkqqaale 

rtvhpnspgipyrtlepletegatrsmlssflppv 

pslpsdppehfplrktvsepnlklr\'kpkkslerr 

knpllrkesappslrrrpaetlgdsspsssstpas 

gcsspndsehgpnpilgseallgqrlrlqetsvap 

falptvsllpaitlglpaparadsdrrthptlgpr 

gpilgsphtplflphglepeaggtlpsrlqpillld 

psgshaplltvpglgplpfhfaqslmtterlsgsg 

lhwplsrtrseplppsatappppgpmqprleqlkt 

hvqvkrsakpsekprlrqipsaedletdgggpg 

qwddglehrelghgqpeargpaplqqhpqvll 

weqqrlagrlprgstgdtvllplaqgghrplsr 

aqsspaapaslsapepasqarvlsssetpartlpf 

ttgliydsvmlkhqcscgdnsrhpehagriqsiw 

srlqerglrsqceclrgrkasleelqsvhserhv 

llygtnplsrlkldngklagllaqrmfvmlpcg 

gvgvdtdtiwnelhssnaarwaagsvtdlafk 

vasrelkngfavvrppghhadhstamgfcffns 

vaiacrqlqqqskaskilrvdwdvhhgngtqqt 

r Y QUr b V L Y ISLHRHDDGNFFPG SG A VDE VG AG S 

gegfnvnvawaggldppmgdpeylaafriwm 
piarefspdlvlvsagfdaaeghpaplggyhvsa 
kcfgymtqqlmnlaggavvlalegghdltaic 
daseacvaallgnrvdplseegwkqkpnlnair 










SLEAWIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 


3522 


A 



9 


602 


KMAALGEPVRLEKDICRAIELLEKLQRSGEVPPQ 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRWELPKTE 
EGLGFNIMGGKEQNSPIYISRnP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVVRYTPKVLEEMESRFEKlvlRSAKRRQQT 


3523 


A 


645 


\A65 


IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQL\RPSSGRS 

GMRASAPQHHEALSLFCYEDQEAVCLICAISHTH 

RAHTVVPLDDATQEYKEKLQKCLEA\LNQKLQEI 

TRCKSSEEKKPGELKJRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 


3524 


A 


3 


698 


PMVRHEAGEALGAIGDPEVLEILKQYSSDPVIEV 

AETCQLAVRRLEWLQQHGGEPAAGPYLSVDPAP 

PAEERVDVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERWRE\SCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 


3525 


A 


1452 


694


EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 

SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 

MTDGQLRSKRDEFWDTAPAFEGRKEIWDALKA 

AAYAAEANDHELAQAILDGASITLPHGTLCECY 

DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 

PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 

RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET

KIQKDFVIQVnNQPPPPQD 


3526 


A 


123 


3441 


PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RKRSHHGLGFLCCFGGSDffEINLRDNHPLQFME 

FSSPIPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKKEQEDPNKLATSWPDYYIDRI 

NSMAAMQSLYAFDEEETEMRNQWEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEILGAVCLVPGGHKKVLQAML

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK 

TAIMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFELIHKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRILQQI 

VLQDERGVDPDLAPLENFNVKNIVNMLINENEV 

KQWRDQAEKFRKEHMELVSRLERKERECETKTL 

EKEEMMRT\LNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSM-mrDL 

rrrrr r JLr r AuCr rr'rr I'rLrr CjrGr r 1 Fr (j ArPCLO 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLNEERVPGTVW'NEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGREIAQNCIILLSKLKLSNEEIRQAILKMD 






EQEDLAKDMLEQLLKFIPEKSDIDLLEEHKHErER 

IVL^RADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAILLASRELVRSKRLRQMLEVILAI 

GNFJVlhJKGQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYLIMILEKHFPDILNMPSELQHLPEAAKVN 

LAELEKEVGNLRRGLRAVEVELEYQRRQVREPS 

DKFVPVMSDnXVSSFSFSELEDQLNEARDKFAK 

ALMHFGEHDSKMQPDEFFGDFDTFLQAFSEARQD 

LEAMRRRKEEEBRRARMEAMLKEQRERERWQR 

QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 

LCKXKRSRKRSGSQALEVTRERAINRLNY 


3527 


A 


1445 


714 


LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 

MEKTAVAAEVLTEDCNTGEMPPLQQQIIRLHQE 

LGRQKSLWADVHGKLRSHIDALRJBQNMELREKL 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKMQSGHSSWGQRSSS 

NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 


3528 


A 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLLKKLAFYNPGRNIFLSPLSISTAFS 

MLCLGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTNFQNLEMAQKQINDFI/ESKTH 

GKFNNLIENIDPGTVMLLANYIFFRARWKHEFDP 

NVTKEEDFFLEKNSSVKVPMMFRSGIYQVGYDD 

KLSCTILEIPYQKNITAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRWDVSVPRLHMTGTFDLKXT 

LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA 

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YLLLIYSEKIPSVLFLGKTVNPIGK 


3529 


A 


1


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEnQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEEniQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYnQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLPINLVPSSSICEDVI 

SQQLTHKDKmMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETffMWSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPBaESSPDDDVQ 

QWFDLICKWSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

EETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 










NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTH\^VTAQDLIGNKNMQMMSIEILTLL 

ETELAKVffiSSAKGFPSFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFWS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRA\LHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGlTAnHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMXIAASASLTTDsfLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKT/DAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVEPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRABCKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 


3530 


A 


1 


5684 


VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETnQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYnQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAWIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKTRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEEENFSLTVNPLSDRLSL 

LSTSSETIPMWSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKWSGLEVESASVTSQLEEEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

lETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 










FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 
NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 
TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 
SVMGKDFYSHIPVDSNHNFRSSMYIEILrSLCLYY 
MRSHYPTHVKVTAQDLIGNKNMQMMSIEILTLL 
FTELAKVIESSAKGFPSnSDMLSKCKVQKVILHC 
LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 
NFSEDEFDNGSTLQSQLLKVLQRLrV\LEHRVM\T 
ffEE\NETGFDFWS\DLEHISPHQPMTSLQYLHAQ 
SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 
STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 
YETGLSDSRPLWMASIIPPDMELTLLEGITAnHYC 
LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 
MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 
ATKl^RQQILELLGPISMNHGVHFMAAIAFVWN 
ERRQNKTTTRTKVIPAASEEQLLLVELVRSISyM 
RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 
QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 
APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 
HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 
VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 
TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 
VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 
AWIOKEAFDLFMDPSFFQMDASCVNHWRAIMDN 
LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 
LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 
ESLRLPQVPILHSQVFLFFRVLLLRMSPQHLTSL 
WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 
GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 
LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 
QGIHQREFKPYWRLAKLLRKRAKKNPEEDNSG 
RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF
NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 
CSSKARQKffiEMVEKDFLEGMIKT 


3531 


A 


553 


2470 


LISPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHBOHTGEKPYQCGSCGKAFTCHSSLTVH 

EKIHSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRIHSGEKPFKC 

NECGKAFSSHAYLIVHRRIHTGEKPFDCSQCWKA 

r aL,HSc>i_.i v HQRJLHTGEKPYKCSECGRAFSQNHCL 

IKHQKIHSGEKSFKCEKCGEMFNWSSHLTEHQRL 

HSEGKPLAIQFNKHLLSTYYVPGSLLGAGDAGLR 

DVDPIDALDVAKLLCVVPPRAGRNFSLGSKPRN 


3532 


A 


3931 


317 


HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES 










SLTAEEVCmiAHKVGITPPCFNLFALFDAQAQV 

WLPPNHILEIPRDASLMLYF\RHRFYSR\NWHGM 

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGIPLEEVAKKTSFKDCIP 

RSFRRHIRQHSALTRLRLRNVFRRFLRDFQPGRLS 

QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA 

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV 

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSIHRQD>JKCLELSLPSRAAALSFVS 

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDGIH 

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV 

AQRSQAPDGMQSLRLRKFPffiQQDGAFVLEGWG 

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 

PGETSm-IIMRGARASPRTLNLSQLSFHRVDQKEI 

TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FKLSDPGVGLGALSREERVERIPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRTILRDLTRLQPHNLADVLTVNPDSPASDPT 

VFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHmCYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRHSIGLAQLLLFAQQICEGMAYLHAQHYIHRDL 

AARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT 

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF 

RPTFENLIPILKTVHEKYQGQAPSVFSVC 


3533 


A 


182 


3465 


FRWLDFFRGSINSQFEFGRKKENMTSPAiCFKKDK 

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEIEMDYSRNLEKLAERFLAKT 

RSTKDQQFKKDQhTVLSPVNCWNLLLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKXSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

NVRffiEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KAIKARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVE>JLDATSDKQRLMEMYJWVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

SILKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGRNLITKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLVVESCIR 

nSKHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLn 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTmQHENIFPSPRE 



WO 01/57190

PCT/US01/04098













LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPffiAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGBDGLIPHQYIVV 

QDTEDGWERSSPKSEIEVISEPPEEKVTARAGAS 

CPSGGHVADIYLANINKQRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

PDKCSISGHGSLNSISRHSSLKNRLDSPQIRKTAT 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALNELR 

ELERQSSVKHTPDWLDTLEPLKTSPWAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPAT\RPKPT\VFPKTNATSPGVNSST 

SPQSTDKSGTV 


3534 


A 


1 


2640 


FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRjNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHLSIYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREVVKPAAVLSKGEIWKNNPNESV 

TANAATNSPSCTRELSWTPMGYWRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAVAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPWVSVGAV 

PVLSBCECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKKIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

HTTFPKIHSRRFRDYCSQMLSKEVDACVTDLLKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

LKLKKLKCVnSPNCEKIQSKGGLDDTLHTIIDYA 

CEQNIPFVFALNRKALGRSLNKAVPVSWGIFSY 

DGAQDQFHKMVELTVAARQAYKTMLENVQQE 

LVGEP\SLRHLPAYPHRAPAALQKMAPQP/VKEK 

EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 

MNLNL 


3535 


A 


1747 


983 


LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 

RYRGPQPQPAPPSALPNSRPSPVASGREMVVLSV 

PAEVTVILLDIEGTTTPIAFVKDILFPYIEENVKEY 

LQTHWEEEECQQDVSLLRKQVXFADWPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADSIGCSTNNILFLT 

DVTREASAAEEADVHVAVWRPGNAGLTDDEK 

TYYSLITSFSELYLPSST 


3536 


A 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 

lESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 

RRRAGSPRRCAPRPRACPQGWSRARHQPGGLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL 

YKTELSKEECCSTGRLSTSWTEEDVNDN-ELFKW 

MIFNGGAPNCIPCKETCENVDCGPGKKCRMNKK 

NKPRCVCAPDCSNITWKGPVCGLDGKTYRNECA 

LLBCARCKEQPELEVQYQGRCKKTCRDVFCPGSS 










TCVWDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYS\SACHLRKATCLLGRSIGLAYEGKCIKAK 
SCEDIQCTGGJCKCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGYLLE 
VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


3537 


A 


285 


2123 


IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

KVERIYLYHNSLDEFPTNLPKYVKELHLQENNIR 

TITYDSLSKIPYLEELHLDDNSVSAVSffiEGAFRD 

SNYLRLLFLSRNHLSTIPWGLPRTffiELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGIVSTIQITTAJPNTVYPAQGQWPAPVTK 

QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTl 

fflSWKLALPMTALRLSWLKLGHSPAFGSITETTVT 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

ETPVCIETETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SKNCAYSKGRRRKDDYAEAGTKKDNSILEIRETS 

FQMLPISNEPISKEEFVIHTIFPPNGMNLYKNNH 


3538 


A 


877 


6184 


WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKffiSHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQNMTTDAPKKTVAA 

KYEVIHSKTKVNVKSVKRNTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVJCNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEHPGVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDtLENQATVEFHSGDKTMECEKLGLSKHTT 

hIDRTKYIDDTVKHKVKILkRESGEGRNSSDCRD 

NEKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKKKV\EKGVL 

NVHPAASASBCPSADQIRQSVRHSLKDILMKRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKNNELFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTIEMIEKEQREVE 

RRPITKITHKGEIEIESDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEK\RKEEVDSMSKDTTSQHRQHLF 

DLNCKICIGRMAPPVDDLSPKKVKVWGVARKH 

SDNEAESIADALSSTSNILASEFFEEEKQESPKSTF 

SPAPRPEMPGTVEVESTFLARLNFIWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

TVWDYVEKIKASGTKEICWRFTPVTEEDQISYT 

LLFAYFSSRKRYGVAANNMKQVKDMYLEPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLnRQKLKRQ 










HSACASTSHIAETPESAPPIALPPDKKSKIEVSTEE 

APEEENDFFNSFITVLHKQRNKPQQNLQEDLPTA 

VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 

LANKPLPVDDILQSLLGTTGQVYDQ\AQSVMEQ

NTVKEBPFLNBQTNSKIEKTDNVEVTDGENKEIK 

VKVDNISESTDKSAEIETSWGSSSISAGSLTSLSL 

RGKPPDVSTEAFLTNLSIQSKQEETVESKEKTLKR 

QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 

TTSESKDGDSCRNGEKHMLPGLSHNKEHLTEQIN 

VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 

SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 

PPPLLPPPGFG\FA\QNPMVPWPPVV\HLP\GQPQR 

MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 

ERHEKEWEQESERHRRRDRSQDKDRDRKSREEG 

HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 

KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 

DHTORTKSKR 


3539 


A 


157 


1769 


GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 

GAGGLIRSLIQCTWAPAGPARRGGRGIEDFPYLF 

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EA\SLYWWYLLELGEYLSLLIRLPFDVKRKGGGP 

SSIKPRPHYDPPSTA\DFKEQVIHHFVAVILMTFSY 

SANLLRIGSLVLLLHDSSDYLLEACKMVNYMQY 

QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESI 

SNRGPFFGYYFFNGLLMLLQLLHVFWSCLILRML 

YSFMKKGQMEKDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 


3540 


A 


267 


1397 


SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKXLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVDLRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRWADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLBCP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FLRHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRDH 

LGQEVrVLSALTGENLEQLLLHLKVLYDAYAEA 

ELGQGRQPLRW 


3541 


A 


1 


8008 


DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTGLLVRIVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVDGAVKKLT 


klwkenpglveqylsailslepnqnyagmlgll 

vqfctshkemdwsqhksalldfymknilmsk 

vkppkylldscapllrylshsefkdlilptiqksl: 

lrspenvtetissllasvtldlsqyamdivkglag 

hlksnsprlmdeavlalrnlarqcsdssamesl 

tkhlfailggsegkltvvaqkmsvlsgigsvshh 

wsgpssqvlngivaelfipflqqevhegtlvha 

vsvlalwcmftmevpkkltewfkkafslktst 

savrhaylqcmlasyrgdtllqaldllplliqt 

vekaasqstqvptitegvaaallllklsvadsqa 

eaklssfwqlivdekkqvftsekflvmasedal 

ctvlhvlterlfldhphrltgnkvqqyhralva 

vllsrtwhvrrqaqqtvrkllsslggfklahgl 

leelktvlsshkvlplealvtdagevteagkay 

vpprvlqealcyisgvpglkgdvtdteqlaqem 

lnshhpslvavqsglwpallarmpcidpeafitrh 

ldqnprmttqsplnqssmnamgslsvlspdrvl 

pqlistitasvqnpalrlvtreefaimqtpagely 

dksdqsaqqdsikkanmkrenkaysfkeqnele 

lkeeekkkkgikeevqltskqkemlqaqldrea 

qvrrrlqeldgeleaalglldiilaknpsgltqyi 

pvlvdsflpllksplaapriknpflslaacvmpsr 

lkalgtlvshvtlrllkpecvldkswcqeelsv 

avkravmllhthtitsrvgkgepgaaplsapafs 

lvfpflkmvltemphhseeeeewmaqilqiltvq 

aqlraspntppgrvdengpellprvamlrlltw 

vigtgsprlqvlasdtlttlcasssgddgcafae 

qeevdvllcalqspcasvretvlrglmelhmvl 

papdtdeknglnllrrlwvvkfdkeeeirklae 

rlwsmmgldlqpdlcslliddviyheaavrqag 

aealsqavaryqrqaaevmgrlmeiyqeklyr 

pppvldalgrvisesppdqwearcglalalnkls 

qyldssqvkplfqffvpdalndrhpdvrkcmld 

aalatlnthgkenvnsllpvfeeflknapndas 

ydavrqsvwlmgslakhldksdpkvkpivakl 

laalstpsqqvqesvasclpplvpaikedaggmiq 

rlmqqllesdkyaerkgaayglaglvkglgils 

lkqqemmaaltdaiqdkknfrrregalfafem 

lctmlgklfepywhvlphlllcfgdgnqyvre 

aaddcakavmsnlsahgvklvlpsllaaleees 

wrtkagsvellgamaycapkqlssclpnivpkl 

tevltdshvkvqkagqqalrqigsvirnpeilai 

apvlldaltdpsrktqkclqtlldtkfvhfidap 

sij^mprvqrafqdrstdtrkmaaqiignmysl 

tdqkdlapylpsvtpglkaslldpvpevrtvsak 

algamvkgmgescfedllpwlmetltyeqssv 

drsgaaqglaevmaglgvekleklmpeivatas 

kvdiaphvrdgyimmfnylpitfgdkftpyvgpn 

pcilkaladenefvrdtalragqrvismyaetai 

alllpqleqglfddlwrirfssvqllgdllfhisg 

vtgkmttetaseddnfgtaqsnkaiitalgverr 

nrvlaglymgrsdtqlwrqaslhvwkiwsn 

tprtlreilptlfglllgflastcadkrtiaartl 

gdlvrklgekilpenpileeglrsqksderqgvci 

glseimkstsrdavlyfseslvptarkalcdple 










EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD 

DEEVSEFALDGLKQVMAIKSRWLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLLEATRSPEVGMRQAAAnLNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPWLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSWSITGPLIRILGDRFSWNVKAAL 

LETLSLLLAKVGIALKPFLPQLQriFl'KALQDSNR 

GVRLKAADALGKIISIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVEmnVS 

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 

AVLQQCLLADVSGEDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMILSSATADRIPIAVSGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDNTBCDKNTVVRAYSDQATVNLLKMRQGEEVF 

QSLSKILDVASLEVLNEVNRRSLKiCLASQADSTE 

QVDDTILT 


3542 



A 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAP 

GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 

GQPGDPGPQGPPGLDGKPGREFSEQFIRQVCTDV 

IRAQLPVLLQSGRIRKCDHCLSQHGSPGIPGPPGPI 

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 

PSLCFSVIARRDPFRKGPNY 


3543 


A 


654 


194 


PARSLEKMKASWLSLLGYLWPSGAYILGRCTV 

AKKLHDGGLDYFERYSLENWVCLAYFESKFNPS\ 

AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 

HMSCSALLNPNLEKTIKCAKTTVKGKEGMGAWP 

TWSRYCQYSDTLARWLDGCKL 


3544 


A 


2 


1074 


SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 

LALLLAARWAAFEPITVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVIVFKALTGFRNNKNPBCKPLTLSLHGWAGT 

GKNFVSQMGAENLHPKGLKSNFVHLFVSTLHFP 

HEQKIKLYQDQLQKWIRGNVSACANSVFIFDEM 

DKL\HPGIIE\AIKPFLDYYEHVERVSYR\KAIFIFLS 

NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRAEMRARGSAIDEDIVTRVAEEMTFFP\ 

RDEKIYSDKGCKTVQSRLDFH 


3545 


A 


3 


273 


SAQGRSWGRFYRQIKRHPGnPMIGLICLGMGSA 

ALYLLRLALRSPDVW*SWDRKNNPEPWNRLSPN 

DQYKFLAVSTDYKKLKKDRPDF 


3546 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 
PKVPIKMQVKHWPSEQDPEKAWGARWEPPEK 
DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 










GHHCPRPVPRPRLLGLGPSLPCPS 


3547 


A 


23 


591 


ALSTETRTPDMRRLLLVTSLVWLLWEAGAVPA 
PKVPKMQVKHWPSEQDPEKAWGARWEPPEK 
DDQLWLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 
GHHCPRPVPRPRLLGLGPSLPCPS 


3548 


A 


3 


1641 


TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRRREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRIEEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVWAG 

SSLPTSSKVECNCTQVI*CQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPVL\APSMWTRPQIKD 

FKEKIQQDADSVITVGRGEWTVRVPTHEEGSYL 

FWEFATDNYDIGFGVYFEWTDSI^NTAVSVHVSE 

SSDDDEEEEENIGCEEKAKKNANkPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 


3549 


A 


1837 


3593 


PAVLVLEPASQSRJCQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRHGRRKHVEGGMD 

LIFLKEQTLQAGDLEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAEPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSrVADSPSGMGPLFMNG 

LIAGMDLVGLQNMRNMPGIPLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE 

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEFVGLESQKRKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTN 


3550 


A 


287 


39 


QLNLMOATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 


3551 


A 


21 


3925 


GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WNEQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQnQLQVLNKAKERQLENLBEKLNESERQIRY 

LNHQLVnKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESrVMGLTKKY 

EEQVLSLQKNLDATVTALKEQEDICSRLKDHVK 

QLERNQEAJKLEKTEIINKLTRSLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK 

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKKIEDLHQVKKDEKSffiVETKTDTSEKPKNQ 

LWPESSTSDVVRDDILLLKNEIQVLQQQNQELKE 

TEGBCLKNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKEQQTQEKIKEKLIQQLEK 

EWQSKLDQTKAMKKKTLDCGSQTDQVTTSDVI 

SKKEMAIMIEEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE 

AKEKWKSELENMRKNILPGKELEEKIHSLQKELE 

LKNEEVPWIRAELAKARSEWNKEKQEEIHRIQE 

QNEQDYRQFLDDHRNKINEVLAAAKEDFMKQK 

TELLLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEfflSDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV 

ENADPEWKKKNMAELSICDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

FQELEKAKQECQDLKGKLEKCCRHLQHLERKHK 

AVVEKIGEENNKVVEELIEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

E 1 ARKMRKY YLlCLQQiLCjUDOKbOAtKjLlJVLNA 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 


3552 


A 


771 


375 


ARTRQTSGQAREPEKESPAPGGGGLAEIRSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLENYRNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 


3553 


A 


76 


72 


PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTE 

SEVYDDGTNTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSWYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKLDGFVPAHFLGWYLKTLMIRDWW 

MCMnSVMFEFLEYSLEHQLPNFSECWWDHWM 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN 

IPTYKGKMBCRIAFQFTPYSWVRFEWKPASSLRR 

WLAVCGnLVFLLAELNTFYLKFVLWMPPtHYLV 

LLRLVFF\^GGVAMREIYDFMDDPKPHKKLGP 

QAWLVAAITATELLIVVKYDPHTLTLSLPFYISQC 

WTLGSVLALTWTVWRFFLRDITLRYKETRWQK 

WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 

GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR 


3554 


A 


2 


2105 


r Ubr b ALrbrbLQ law ar Or MbRKALKKLKOtl^K 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNTWLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIBILFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNfflRHVELSErKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLMRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3555 


A 


2 


2106 


FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIWLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRMHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 


3556 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 

VKREYLRVNWKTCEEELNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA
LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 
ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 
LSAQQILHVKQEKPYGRLLIQPGPRFH 


3557 


A 


3388 


1650 


KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNWKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQmiDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPIEPREPERIPVTVLPPEAITILEAEPIR 

MLEffiGERELPEVSKRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 


3558 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEEEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSVFKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3559 


A 


489 


2360 


IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDABCDVKEffiDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA 

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVBISKG AVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEKSATPSRKILDPNTGEPAPVLSSPPPADVSTF 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

rJ-.Js.voo VrivJUHAi VisJWAVvjJJAVlJALMyKArNo 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIAI««.YGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 


3560 


A 


2 


1198 


FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILL\ 

DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV 

nWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLBLACGSSDGAISLLTYTGEGQWEVKKINNAHT 

IGCNAVSWAPAWPGSLIDHPSGQKPNYIKRFAS 

GGCDNLIKL>VKEEEDGQWKEEQKLEAHSDWVR 

DVAWAPSIGLPTSTIASCSQDGRVFIWTCDDASS 

NTWSPKLLHKFNDWWHVSWSITANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

GKSPQLQQDYFPRRSYRCSHRLnCLNVIGDAL 


3561 


A 


540


86 


WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 

GNKQRESEAPWVRHTYIT 


3562 


A 


1920 


242 


PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNINAAKTIADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVDLAGEMLSVAEHFLEQQMHPTV 

VISAYRKALDDMISTLKKISIPVDISDSDMMLNIIN 

SSITTKAISRWSSLACNIALDAVICMVQFEENGRK 

EIDIKKYARVEKIPGGUEDSCVLRGVMINKDVTH 

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITllE 

EDFmiLQMEEEYIQQLCEDIIQLKPDVVITEKGIS 

DLAQHYLMRANTTAIRRVRKTDNNRIARACGARI 

VSRPEELREDDVGTGAGLLEIKKIGDEYFTFITDC 

KDPKACTILLRGASKEILSEVEKNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVEPRTLIQNCGASTERLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETAVLLLRIDDIVSGHKKKGDD 

QSRQGGAPDAGQE 


3563 


A 


1571 


560 


GPSLLGTRGTPNPARTLQFFLIIGRRLTGRMAAV 

DDLQFEEFGNAATSLTANPDATTVNIEDPGETPK 

HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 

KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG 

KNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLS 

NFLIHLGEKTYHYVPEFRKVSIAATIIYAYAWLVP 

LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL 

FIYIPTA^WIIPHKAVRWILVMIALGISGSLLAMT 

FWPAVREDNRRVALATTVTIVLLHMLLSVGCLA
YFFDAPEMDHLPTTTATPNQTVAAAKSS


3564 


A 


1 


328 


NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGIIFTTFWGLVGUGPWFVPKGPNRGynTML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 


3565 


A 


2 


1081 


FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 

RPFHLAAVRNEAWISGRKLAQQIKQEVRQEVEE 

WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

AVVGINSETIMKPASISEEELLNLINKLNlSfDDNVD 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHVIN 

VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 

NVWAGRSKNVGMPIAMLLHTDGAHERPGGDA 

TVTISHRYTPKEQLKKHTILADIVISAAGIPNLrTA 

DMIKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 

EGVRQKAGYITPVPGGVGPMTVAl^MKNTIIAA 
KKVLRLEEREVLKSKELGVATN 


3566 


A 


3 


1130 


SCRRGRQQQRKNfVSLSSQFAHTMAAPAQQTTQP 

GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 

QGILITCn^MNERKCVEEAYSLLNEYGDDMYGPE 

KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 

TEMRLIOO^QSVESG/yWVVFIRTLGIEPEKLVHHI 

LQDMYKTKKKKTRVILRMLPISGTCKAFLEDMK 

KYAETFLEPWFKAPNKGTFQIVYKSRNNSHVNR 

EEVIRELAGIVCTLNSENKVDLTNPQYTVWEIIK 

AVCCLSWKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTKPTSNPQVVNEGGAKPELASQATE 

GSKSNENDFS 


3567 


A 


248 


3498 


GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 

KIKVRNYWTADGDLDIGAKNVKLYVNRNLIFNG 

KLDKGDREAPADHSILVDQKMEKSEQLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAEtLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 

RDKGLRHEPGWGTSRSVhfTKERPQRATTKVHSD 

DSDIFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 

PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 

GQRLVBDIKSTWGDRHYVGLNGIEIFSSKGEPVQI 

SNIKADPPDINILPAYGKDPRVVTNLIDGVNRTQ 

DDMHVWLAPFTRGRSHSITIDFTHPCHVALIRIW 

NYNKSRIHSFRGVKDITMLLDTQCIFEGEIAKA.SG 

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD 

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN 

FTASWGDLHYLGLTGLEVVGKEGQALPIHLHQIS 

ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH 

MWLIPFSPGLDHWTIRLDRAESIAGLRFWNYNK 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 

NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSENNIAAFPDSVNSLEGVGG 

DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 

VYVFDLPTTVSMIKLWNYAKTPHRGVKEFGLL 

VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTnSNQAEDQDVQMMNENQ 

HTNAKRKQSVVDPALRPKTCISEKETRRRRC 


3568 


A 


50 


1724 


AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKRIEALRRKNEALIRRYQEffiE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

l^OrijKKirO 1 rRPPOASKGGRTPPQQGORAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRRQNIEKMNEEME 

KIAEYEKNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 



WO 01/57190

PCT/US01/04098




EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPQAKAAPRAYSDHDDRWE 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEIEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 


3569 


A 


1 


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAEFGGARPREEV 

VQKEQE 


3570 


A 


1


912 


MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

QGDIDAIFKDLSIRSVRLVRDKDTDKFKGFCYVE 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR 

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 


3571 


A 


28 


131 


RHFFGNLCAMRAKWRKKRMRRLKRKRRKMRQ 
RSK 


3572 


A 


3 


1202 


QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLIKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT 

SAERELQQBRINEVKTEISVESKHQTLQGLAFPLQ 

PEAQRALQQLKQKMVNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESWFIYSMPGYKCSKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 


3573 


A 


49 


1869 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGWHEDLRLLLETHLPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

VRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKV 

KFNVNRVDNMUQSISLLDQLDKDINTFSMRVRE 

WYGYHFPELVKIINDNATYCRLAQFIGNRRELNE 

DKLEKLEELTMDGAKAKAILDASRSSMGMDISAI 

DLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAP 

SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 

GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 

NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 

EQVEERLSFYETGEIPRKNLDVMKEAMVQAEAE 




EAAAEITRKLEKQEBCKRLKKEKKIU.AALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKKSTPKEETVNDPEEAGHRSRSKKKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQED 


3574 


A 


284 


2032 


CGNERTARLWVQPWSTMPQASEHRLGRTREPP 
VNIQPRVGSKLPFAPRARSKERRNPASGPNPMLR 
PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 
DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 
RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 
STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR
SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 
LNAVLQCLSSTRPLRDFCLRRDFRQBVPGGGRA 
QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 
KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR 
RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 
MWKRYLEREDSKTVDLFVGQLKSCLKCQACGY 
RSTTTEVFCDLSLPIPKKGFAGGKVSLRDCFNLFT 
KEEELESENAPVCDRCRQKTRSTKKLTVQRFPRI 
LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 
ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 
CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 
MQEPPRCL 


3575 


A 


1 


2408 


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVK 

LIISEGRPTIEVRRCSMPSVICBHTKQFQTISEESN 

QGSLLTVPGDTSPSPKPEVFSNVPERDLS^fVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSQSLSDAES 

ISKHMSLSYVANQEPGILQQKNAVQnSSALDTD 

NESTKDTBNTFVLGDVQKTDAFVPVYSDSTIQEA 

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTENQIPQRMTRNK 

ANTMANQSKQILASCTLLSEKDSESSSPRGRIRLT 

EDDDPQIHHPRKRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAATVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

GNPLSKICIFITITPPSLSDPLKEIJRQQEVVRMKL 

RLQHSffiREKLIVSNEQEVLRVHYRAARTLANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 


3576 


A 


5 


1421 


LRLAWHDGARWPLGTPRAAATRREAAALPPVT 

LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 

LaKVLLrrrLooKl^KYLIHK.TAENfDLLSSFSVGE 

GWKRRTVICHQDIRVPSSDGLSGPCRAPASCPSR 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 

PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 

PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 

CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 

gstlqldlekgkesllekrlvaeeeedeeeveed 
gpsscsbddysellqeitdnltb:keiqiekihldts 
sfmeelpgekdlahweiydfepalktedllatf 
sefqekgfriqwvddthalgifpcrasaaealtr 
efsvlkbrpltqgtkqsklkalqrpkllrlvker 
pqtnatvarjrlvaralglqhkkkerpavrgplp 

P 


3577 


A 


102 


1998 


DTRTPGSLEMGPLQFRDVAEEFSLEEWHCLDTAQ 

RNLYR>rVMLENYSNLVFLGIVVSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN 

nCDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNSNRHNIRHTEKKPFKCffiCGKAFNQFSTLITH 

KXIHTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL 

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSSTLSSHKRSHTGEKPYKCEEC 

GKAFVASSTLSKHEIIHTGKKPYKCEECGKAFNQ 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLTK 

HKKIHTGEKPYKCEECGKAFNQSSTLrKHKKIHT 

REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR 

CRECGKAFNHSATLSSHKKIHSGEKPYECDKCG 

KAFISPSSLSRHEIIHTGEKP 


3578


A 



1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNlQRYFGTNSVICSIGaDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPBCKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNnSDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFRNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3579 


A 


1725 


445 


RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVTCSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGIIKGMKVELSTVNV 

RTTKPPKRRPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNnSDMKVARSATARV 

RSRPELRIQFDEGYDNYPGQEKTDDLKKRKNIFT 

GKRLNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTKEGKLWEFPINNEA 

GFDDDGSEFHEHIFLEKHLESFPKQGPIRHFMELV 

TCGLSKNPYLSVKQKVEHIEWFKNYFNEKKDILK 

ESNIQFKLRPWKFLFRNN 


3580 


A 


3673 


1619 


LYCVAPYSRHLLGRMSHLPMKLLRKKIEKRNLK 
LRQRNLKFQGASNLTLSETQNGDVSEETMGSRK 
VKKSKQKPMNVGLSETQNGGMSQEAVGNIKVT 




KSPQKSTVLTNGEAAMQSSNSESKKKKKKKRK 

MVNDAEPDTKKAKTENKGKSEEESAETTKFIEN 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 

NLVNENTLKAKEMGFTNMTEIQHKSmPLLEGR 

DLLAAAKTGSGKTLAFLIPAVELIVKLRFMPRNG 

TGXaiLSPTRELAMQTTGVLKELMTHHVHTYGLI 

MGGSNRSAEAQKLGNGINnVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRILDVGFEEELKQIIKL 

LPTRRQTMLFSATQTRKVEDLARISLfCKEPLYVG 

VDDDKANATVDGLEQGYWCPSEKRFLLLFTFL 

KKNRKKKLMVFFSSCMSVKYHYELLNYIDLPVL 

AIHGKQKQNKRrriFFQFCNADSGTLLCTDVAA 

RGLDEPEVDWIVQYDPPDDPKEYIHRVGRTARGL 

NGRGHALLBLRPEELGFLRYLKQSKVPLSEFDFS 

WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA 

YDSHSLKQIFNVNNLNLPQVALSFGFKVPPFVDL 

NVNSNEGKQKKRGGGGGFGYQKTKKVEKSKIF 

KHISKXSSDSRQFSH 


3581 


A 


23 


453 


LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLIRMDLLYWQFTIYTITFCFSHLSG 

RLTLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSWHLPIVYKGSMT 

QVSPH 


3582 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVR^JMSPDEIKIPPEPPGRC 

SNHLQDKIQKLYERKJKEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAl 

PVXmQPTILTTTATLPAVVTVTTSASGSKTTVIS 

AVGTIVKKAKQ 


3583 


A 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

SNHLQDMQKLYERKKEGMDMNYIIQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTBOEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTmQPTILTTTATLPAVVTVTTSASGSKTTVIS 

AVGTIVKKAKQ 


3584 


A 


3 


1139 


PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

LRAEGKADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGEhfnCEATQnDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELLDSLPMDAYTHGCILHPELTV 

DSMIPAYATTRIRSQIGNTESELKKLAEENPDLQE 

A I JLAls.l^KKLK.S)KX.LUHDN VKYLKKlLDbLEKVL 

DQVETELPRRNEETPEEGQQPWLCGESFTLADVS 

LAVTLHRLKFLGFARRNWGNGKRPNLETYYERV 

LKRKTFNKVLGHVNNILISAVLPTAFRVAKKRAP 

KVLGTTL WGLLAG VGYFAFMLFRKRLGSMILA 

LRPRPNYF





3585 


A 


1 


1777 


RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTILHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQIRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLffiPLLKEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF 

LSNPGTNLVFEDEITALQPEVDKLKTLNVNKIIAL 

GHSGFEMDKLIAQKVRGVDVWGGHSNTFLYT 

GNPPSKEVPAGKYPFIVTSDDGRKVPVVQAyAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSffEDPS 

IKADINKWRIKLDNYSTQELGKTIVYLDGSSQSC 

RFRECmiGl^ICDAMINNNLRHTDEMFWNHVS 

MCILNGGGIRSPIDERNNGTITWENLAAVLPFGG 

TFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHWYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKVILPNFLANGGDGFQMIKDELLRH 

DSGDQDINWSTYISKMKVIYPAVEGRIKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 


3586 


A 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMIKEESDYHDLESWQQVEQN 

LELMTKRAVKAENHWKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTWKQNADVALQNLRWM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 


3587 


A 


88 


1639 


GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTIFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS 

CLNHPDKKIYAYSSMILFTSLNHERMKELEENLN 

lAIDVIDAYQKHPESEWPFLIITDLFLKSPELVQA 

MFPKLNNQERVTLLDLMIAKITSDEPLTKDDIPVF 

LRHAELIASTFVDQCKTVLKLASEEPPDDEEALA 

TIRLLDVLCEMTVNTELLGYLQVFPGLLERVIDL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CNISDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKLILKSTRD 

TPKP 


3588 


A 


3 


1462 


DSPRNRFEELGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQWTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

KLOrCJNQFlKHKMVTALG IHWHPEHFCCVSCGE 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

ILDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTTCLRPL'EKGSFQERAGKPYCQP 

CFLKLFG 



3589 



226 



6793 



SPPKKSRKCNLSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRIENIATEREFDPEEFYYLLEAAEGHA 

KEGQGIKTDIPRYnSQLGLNKDPLEEMAHLGNY 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

ia.ISNGAYGAVYF\aiHKESRQRFAMKKINKQNL 

ILRNQIQQAFVERDBLTFAENPFWSMYCSFETRR 

HLCMVMEYVEGGDCATUSIKNMGPLPVDMARM 

YFAETVLALEYLHNYGIVHRDLKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVmRQGYGKPVDWWAMGn 

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLITLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFDPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRTTQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMffGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPrVfflSSGBCNYGFT 

IRAZRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVIELLLKSGNKVSI 

TTTPFENTSIKTGPARRNSYKSRMVRRSKKSKkK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGHIRPSTLHGLAPKLGGQRY 

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTIVRHrVRPKSAE 

PPRSPLLKRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VWKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPKAVERSSTFENKASMQEAPPLGSL 

LKDALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELLRCEKLDSKLANIDYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGBPCEKELGKVRR 

GVEPKPEALLARRSLQPPGIESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTORRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVBIEASAASSDTSSAK 

AAGGMLELPAPS>mDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFWRQRRGKESLRSSPHK 

KAL 


3590 


A 


3 


935 


RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDESDGK 

ETLETISNEEQTPLLBmNPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLrVLTNVTKNIVAFKVRTTAPEKYRVKPSNSS 

CDPGASVDIVVSPHGGLTVSAQDRFLIMAAEME 

QSSGTGPAELTQFWKEVPRNKVMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 


3591


A 


303 


2 


GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 


3592 


A 


1052 


1779 


GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVrVSPQTVQVKGENGNLVITPDGNVMYNGK 

QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSR 

VEKARIALDKIIVQEMGESSKMRSRLTKLDAQVK 

EQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGBCDVCSRVVTLE 

DSRKALVGNLK 


3593 


A 


3 


1837 



LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS 

QETDFSEASLLEKQQEVHS AGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHTITGEQPSGCTG 

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRIHTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS 

QLRQHQSIHTGEKPYQCKEGGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGNHPNWGCCLTISSPGPS 

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLERISIHIQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSWL 

iJlJ 


3594 


A 


39 


261 


RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSimSVKGPVREGDVLTLLESERE 

ARRLR 


3595 


A 


973 


6S 


GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










DKEWMPVTKLGRLVKDMKIKSLEEIYLFSLPIKE 

SEHDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 

KAr VAlCjlJ I N(jHVuLOVKCSK±.V ATAIRGAIILA 

KLSIVPVRRGYWGNKIGKPHTVPCKVTGRCGSV 

LVRLIPAPRGTGIVSAPVPKKLLMMAGIDDCYTS 

ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 

TVFTKSPYQEFTOHLVKTHTRVSVQRTQAPAVA 

TT 


3596 


A 


106 


2960 


derrvgaadmfgrsrswvggghgktsrnihsl 

dhlkylyhvltknttvteqnrnllvetirsiteil 

iwgdqndssvfdffleknmfvfflnilrqksgry 

vcvqllqtlnilfenishetslyyllsnnyvnsii 

vhkfdfsdeeimayyisflktlslklnnhtvhff 

ynehtndfalyteaikffnhpesmvriavrtitl 

nvykvsldnqamlhyirdktavpyfsnlvwfig 

shvielddcvqtdeehrnrgklsdlvaehldhl 

hylndilimceflndvltdhllnrlflplyvysl 

enqdkggerpkislpvslyllsqvfliihhaplvn 

slaevblngdlsemyakteqdiqrssakpsircfi 

kptetlerslemnkhkgkrrvqkrpnyknvgee 

edeekgptedaqedaekakgteggskgiktsges 

eeiemvimersklselaastsvqeqnttdeeksa 

aatcsestqwsrpfldmvyhaldspdddyhalf 

vlcllyamshnkgmdpekleriqlpvpnaaekt 

tynhplaerliriiynwaaqpdgkirlatlelscl 

llkqqvlmsagcimkdvhlaclegareesvhlv 

rhfykgedifldmfedeyrsmtmkpmnveylm

mdasillpptgtpltgidfvkrlpcgdvektrrai 

rvffmlrslslqlrgepetqlpltreedlktddv 

ldlnnsdliactvitkdggmvqrslavdiyqms 

LVfcPDVSKJLuWGVVKFAGLLQDMQVTGVEDDS 

ralnitihkpassphskpfpilqatfifsdhirciiak 

{^KJLAKGKJt^ARKMKMQRlA ALLDLPIQPi I E VLG 

fglgsststqhlpfrfydqgrrgssdptvqrsvf 
asvdkvpgfavaqcinehsspslssqsppsasgsp 
sgsgstshcd3ggtsssstpstaqspagighvtq 


3597 


A 


427 


277 


gvrriqhhwaqmhecnvhtyaslfclfllhtg 
klcclnshrhfhcikysk 




A 

A 


X 




rKrKIKJCArAMYLEHYLDSlENLPCELQRNFQL 

mreldqrtedkkaeidilaaeyistvktlspdqr 
verlqkiqnayskckeysddkvqlamqtyemv 

DKHIRRLDADLAKFEADLKDKMEGSDFESSGGR 

GLKKGRGQKEKRGSRGRGRRTSEEDTPKKKKH 

KGG 


3599 


A 


2 


3907 


KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYIVSVGYQH 

DMIVNVWAWKKNIWASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGBCKADSTFCITSSG 

SSFITCSSDNTIRLWNTESSGVHGSTLHRNILSSDL 

IKIIYVDGNTQALLDTELPGGDKADASLLDPRVGI 

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 

VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










MSCGADKSIYFRTAQKSGDGVQFTRTHHVVRK 

TTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQ 

KKLFKGSQGEDGTLDCVQTDPSGIYIATSCSDKNL 

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGrVYPEP 

SDNPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTERWACLGEGTTPKPRTECQAHPGPSSPCA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 


3600 


A 


1688 


916 


IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 

QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN 

TRWAKIILPRLIRKGNSLDIPVAVTIFFGANDSAL 

KDENPKQHIPLEEYAANLKSMVQYLKSVDIPENR 

VILITPTPLCETAWEEQCnQGCKLNRLNSWGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 


3601 


A 


44 


223 


VHFPLIPQLAKCFWTMNRAARNKSEKRYYSEFL 
QIAHLFNYGLSSFLREFIFLIKLLQ 


3602 


A 


37 


1124 


VPKPASGKRRLEFRPQDSKACAATPHSPGRITSR 

TRGSQKVRSVPPRLPWAQASASTDWEQLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGEl 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPRQIPSHIVRLBCPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSETINPKTCS 

PBCEYLETFIFPVLLPGMASLLHQAKKEKCFEWL 

QMTPSGGKACVWGHLPSSSHTI 




A 


Zoo 


JO/ 


JNlor^lliAJl V aanr o V lonbmiJorOyr Kr JlUN 

RMQKKYWKTKQVFIKATGKKEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 


3604 


A 


103 


2440 


QPRRRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNIEKIIHVTTKLVPSIKRLHNCDTILKHTLN 

SHNHNRNSATKNLGKIFGNGNNFPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPTHHQKIHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKJHTGQKPYKCSECGKAFFQRS 

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSfflQ 

KTHTGEKHYECNECGKAFTRKSALRMHQRIHTG 

EKPYVCADCGBCAFIQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRTNLTTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTHTGEKPYKCNGCGKAFIWKSRLKIH 

QKSfflGERHYECKDCGKAFIQKSTLSVHQRIHTG 

EKPYVCPECGKAFIQKSHFIAHHRIHTGEKPYECS 

DCGKCFTKKSQLRVHQKIHTGEKPNICAECGKAF 

TDRSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

MHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

HTGKKPYACTECQKAFTDRSNLIKHQKMHSGEK 

RYKASD 


3605 


A 


3


322


SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGl 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN 
VIRDAVTYTEHAKRKTVTAMDWYALKRQGRT 
LYGFGG 


3606 


A 


1 


1749 


VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYKWSQCRRESSHKHTFFHPRVCTGKRLYESS 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKIfflRVHTGERPYECGECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

CGKSFSLKCGLIQHQLIHSGARPFECDECGKSFSQ 

rttlnkhhkvhtaerpyvcgecgkafmfkskl 
vrhqrthtgerpfecsecgkffrqsytlvehqki 
htglrpydcgqcgksfiqkssliqhqvvhtgerp 
yecgkcgksftqhsglilhrkshtverprdsskc 

UKrYbFRSlvlV 


3607 


A 


92 


331 


AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKT 
CWQRFKTGAFSERQGSTIGVDFTMKTLEIQGKR 
VKLQIWDTAGQER 


3608 


A 


545 


379 


AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 


3609 


A 


118 


873 


VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 

GHSYCKGCLVSl^YHLDTKVRCPMCWQVVDGS 

SSLPNVSLAWVffiALRLPGDPEPKVCVHHRNPLS 

LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 

MKEELAALFSELKQEQKKVDELIAKLVKNRTRIV 

NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG 

HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










GNEDHHEFIWKFHSMASR 


3610 


A 


2 


987


DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESRQITQVYGFYDECQTKYGNANAWRYCTK 

VFDMLTVAALIDEQILCVHGGLSPDIKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAPNYCYRCGNIASIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 


3611 


A 


245S! 


869 


AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAWLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLUSERIQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVrVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN 

LWHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHSLEKTFKCNECEKAFSYSSQLARHQKVHITE 

KCYECNECGKTFTRSSl>}LrVHQRIHTGEKPFACN 

DCGKAFTQSANLIVHQRSHTGEKPYECKECGKA 

FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 

IVHQRIHSGDLPYVCNECGKAFTCSSYLLIHQRIH 

NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 

YECEKCGAAFISNSHLMRHHRTHLVE 


3612 


A 


318 


2245 


SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA 

SVGCLHGIEAEEAiPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKIPLSDNLFPCKDVEKDFPTILGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIH'mE 

RSYVCIECGKSLSSKYSLVEHQRTONGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ 

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCnCGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKQTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF 

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

RIHTGEKPYECGKCGKAFNKRYSLVRHQKVHIT 

EEP 


3613 


A 


817 


3345 


NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQHGSSQYSGTYASFIPSQLIPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










PAQQNQYVHISSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQWMQYADSGSHFVPREATK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGK5VPHPYESRHVVVHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMVQAQIHLPWQSVASPAAAPPTLPPYFMK 

GSnQLANGELKKVEDLKtEDFIQSAEISNDLKIDS 

STVERIEDSHSPGVAVIQFAVGEHRAQVSVEVLV 

EYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 

VCISLTLKNLKNGSVKKGQPVDPASVLLKHSKA 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF 

PEKMGLSAAPFLTKIEPSKPAATRKRRWSAPESR 

KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK 


3614 


A 


3 


114 


FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 


3615 


A 


3 


1603 


DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL 

RDKEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEIDNYKKDL 

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKKDSRLKTLEIALEQKKEECLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRLLEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKKVANLKHKEQVEKKKSAQIVILEEARRR 

EDNLNDSSQQLQDSLRKKDDRDEELEEALRESVQ 

ITAEREMVLAQEESARTNAEKQVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETHLTNLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM 

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA 


3616 


A 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSALNRKEGLRLAEDRSKDPHDPHKI 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDIL 

DELLGNMVLAPLPDPGPPSLAVAPEPCPQPLRSPS 

LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

YRGRQVFQQTISCPEGLRLVGSEVGDRTLPGWP 

VTLPDPGMSLTDRGVMSYVRHVLSCLGGGLAL 

WRAGQWLWAQRLGHCHTYWAVSEELLPNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

WPTCLRALVEMARVGGASSLENTVDLHISNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 


3617


A 

A 




304 


RGGLLSKMARVLKAAAANAVGLFSRLQAPIPTV 
RASSTSQPLDQVTGSVWNLGRLNHVAIAVPDLE 
KA A AFVTirMIT GADV'sFAVPI PFWr?V<:wrr\/TJT n 

x^-fTLTLn-i 1 JT^Nil^vjrtV^ V OXjrt V fljx JC<ri\J V O V Vr VlNi-flJ 

NTKMELLHPLGRDSPIAGFLQKNKAGGMHHICIE 
VDNINAAVMDLKKKKIRSLSEEVKIGAHGKPVIF 
LHPKDCGGVLVELEQA 


3618 


A 


3 


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAWRCTLSANMYVDEILVWCASEL

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSBa'LYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKXm^ALMCMLREIGKHINMIXjTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQHVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRMEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPWLEQTYVGITEKKAIKRFQIMNEIVYEKIM 

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPffiSQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEEE 

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMIIAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWILVEDVDSEVELHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRWSDRWLSCETQLPVSFR 

HLILPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DICFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NmSTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQffiRPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFNISHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQWVASRSLCWGMNVAAHLVnM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTnEL 

FSMSLNAKTKVRGLIEnSNAAEYENlPIRHHEDN 

r T 1>/^T A /~w\7TiXJli^ "XTXTTif T!Tvrr\mn/T*'''nkrT t t a ttt 

l-.LK<s2LAl^KVrHJ?U^NNrKi"r*JDrHvKTWLLLQAH^ 

SRMQLSAELQSDTEEBLSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEEKNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VWLVQLEREEEVTGPVIAPLFPQKREEGWWW

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 


3619 


A 


3


5992 


DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 

DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDBDAFWLQRQL 

SRFYDDATVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGBCMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTNVALMCMLREIGKHINMDGTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEElSATQnVCTPEKWDnTRKGGERTYTQLV 

rliildeihllhddrgpvlealvarairniemtqe 

dvrliglsAtlpnyedvatflrvdpakglfyfdn 

sfrpwleqtyvgitekkaikrfqimneivyekim 

ehagknqvlvfvhsrketgktarairdmclekd 

tlglflregsastevlrteaeqcknlelkdllpy 

gfaihhagmtrvdrtlvedlfgdkhiqvlvsta 

tlawgvnlpahtvnkgtqvyspekgrwtelga 

ldilqmlgragrpqydtkgegilitshgelqyyl 

sllnqqlpiesqmvsklpdmlnaeivlgnvqna 

kdavnwlgyaylyirmlrsptlygishddlkgd 

plldqrrldlvhtaalmldknnlvkydkktgn 

fqvtelgriashyyttndtvqtynqllkptlseie 

lfrvfslssefknitvreeeklelqkllervpipvk 

esffiepsakinvllqafisqlklegfalmadmvy 

vtqsagrlmraifeivlnrgwaqltdktlnlck 

mtokrmwqsmcplrqfrklpeevvkiaekknfp 

ferlydlnhneigelirmpkmgktihkyvhlfpk

lelsvhlqpitrstlkveltitpdfqwdekvhgss 

eafwilvedvdsevilhheyfllkakyaqdehli 

tffvpvfeplppqyfirvvsdrwlscetqlpvsfr 

hlujpekyppptelldlqplpvsalrnsafeslyq 

dkfpffnpiqtqvfntvynsddnvfvgaptgsgk 

ticaefailrmllqnsegrcvyitpmrlwqeqvy 

mdwyekfqdrlnkkvvlltgetstdlkllgkg 

NfflSTPEKWDILSRRWKQRKNVQNINLFVVDEV 

hliggengpvlevicsrmryissqierpirivalsss 

lsnakdvahwlgcsatstfnfhpnvrpvplelhi 

qgfnishtqtrllsmakpvfhaitkhspkkpvrvf 

vpsrkqtrltaidilttcaadiqrqrflhctekdl 

ipyleklsdstlketllngvgylheglspmerrl 

veqlfssgaiqvwasrslcwgmnvaahlviim 

dtlyyngkihayvdyprfdvlqmvghanrplq 

hcmhdhfnaeivtktienkqdavdyltwtflyr 
rmtqnpnyynlqgishrhlsdhlselveqtlsdl 
eqskcisffidemdvaplnlgmiaayyyinyttiel 

FSMSLNAKTKVRGLffinSNAAEYENIPIRHHEDN 
LLRQLAQKVPHKLNNPKFNDPHVKTNLLLQAHL

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEWDKDSIRSGGP 

VWLVQLEREEEVTGPyiAPLFPQKREEGWWVV 

IGDAKSNSLISKRLTLQQKAKVKLDFVAPATGG 

RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 

DSD


3620 


A 


1205 


323 


VDCMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 

VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 

LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

EKPQQHQKTKMrvLGFSNPINWWTRIKAFLIWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAANIDEI 

VFTSTGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIARIEHSKLLE 


3621 


A 


2 


2995 


SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPTTTPPPQTPPLPPLPPIPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLi>PELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDnGIIGEGTYGQVYKAKD 

KDTGELVALBCKVRLDNEKEGFPITAIREIKILRQL 

IHRSWNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDIKCSNILLNNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVIKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPQPAPGKVESGAGDAIGLADITQQLNQSELAVL 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

pagsleennsdknsgpqgprrtptmpqeeaagrs 


3622 


A 


16 


390 


TPERGSAYPETAAVRRPAGECPITMSDLEAKLST 
EHLGDKIKDEDIKLRVIGQDSSEIHFKVK^^TTPLK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 


3623 


A 


2 


1544 


PPPAPGPDGLNEGCLHRLSMPHQRPRTCAMhfPE

SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASILDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSFIXMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

QyA I SGQLbbLN 1 KEVAQKl 1 AELKJR.YSIPQAIFA 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSALRLAACKRKEQEPNKDR 

NNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQ 

ITISQQLGLELTTVSNFFMNARRRSLEKWQDDLS 

TGGSSSTSSTCTKA 


3624 


A 


27 


2152 


SARKAEAATSGTAARDGSVGKNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKIT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEmCNKRHKTVLTELQAKIARLTKRF 

EAAKEDLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYKNAGTVRQMLESKRNVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSGNVEFISVQSPPT 

VSGLTKNPVSLPSLPNPTKPNNVPSVPSPSIQRNP 

TASAAPLGTTLAVQAVPTAHSrVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKIPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS 

SGVIDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

PVSRPLQPIQPAPPLQPSG VFTSGPSQ 1 1 IHLLPTA 

PTTVNVTHRPVTQVTTRLPVPRAPANHQVVYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAIVLVMECPGGGSKLCHC 


3625 


A 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAJHTDIMDDWLDCAFTC 

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE

AVQrNPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALrVGMVRLTQHLSLLCEKYSTW 

RDEVGGKVPYIEQHQFKLCIMRRSKGRAEKS 


3626 


A 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 

FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 

EEEQLRAQGSTDYFLSSGDKIRFFFEKGVFDEKG 

NrLVFFbKiilNKiGHALHAHDPVFKSlTHSFKV 

1j/vi\o JLjVji-^v^ivir V V V v^oivi I IT rv.v^x fiT OvJii V o± rn^u 

ASFLYTEPLGRVLGVWIAVEDATLENGCLWFIPG 

SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 

FVPTPVQRGALVLIHGEVVHKSKQNLSDRSRQA 

YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


3627 


A 


231 


644 


INSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 






WO 01/57190



PCT/US01/04098



SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










NPGIFYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 


3628 


A 


2 


810 


GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFQFKELWLPREIDLNEWLASNTTTFFHHIN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKETAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 


3629 


A 


699 


1604 


CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVroRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 


3630 


A 


423 


1 


PAKVLTLDIYLSKTEGAQVDEPVVITPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 


3631 


A 


2082 


674 


WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGGRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEKAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQ A 

VQTDFSSDPLQKWCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQIPHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEVVSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLIIVTILLLQSAFPG 

FL 


3632 


A 


942 


40 


PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQTICFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

CCQPTCGGSSCCSQTCNESSYCLPCCRPTCCQTT 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 


3633 


A 


605 


3004 


GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLALIMAYQ 
RAYPLEVTKLIYCSRTVPEIEKVIEELRKLLNFYE 






SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)










KQEGEKLPFLGLALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANWVYSYHYLLDPKIADLVSKELARK 

AVWFDEAHNEDNVCIDSMSVNLTRRTLDRCQG 

NLETLQKTVLRIKETDEQRLRDEYRRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHWQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TLLANFATLVSTYAKGFmiEPFDDRTPTIANPIL 

HFSCMDASLAIKPVFERFQSVnXSGTLSPLDIYPK 

ILDFHPVTMATFTMTLARVCLCPMnGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

FTSYQYMESTVASWYEQGILENIQKNKLLFIETQ 

DGAETSVALEKYQEACENGRGAILLSVARGKVS 

EGIDFVHHYGRAVIMFGVPYVYTQSRILKARLEY 

LRDQFQIRENDFLTFDAMKHAAQCVGRAIRGKT 

DYGLMVFADKRFARGDKRGKLPRWIQEHLTDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLIORIEQIAQQL 


3634 


A 


159 


384 


LKMSSKTASTNNIAQARRTVQQLRLEASIERIKV 
SKASADLMSYCEEHARSDPLLIGIPTSENPFKDKK 
TCIIL


3635 


A 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 
LETQFTCPFCNHEKSCDVKMDRARNTGVISCTV 
CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 


3636 


A 


48 


282 


DHLKSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANAI 
IHLFCFS 


3637 


A 


1 


1248 


ARAGSWGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFASICTMGAN 

ASALEKEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL 

FHSIATQKKKVGVIPPKKFITRLRKENELFDNYM 

QQDAHEFLNYLLNTIADELQEERKQEKQNGRLPN 

GNIDNENNNSTPDPTWVHEDFQGTLTNETRCLTC 

ETISSBQ5EDFLDLSVDVEQNTSITHCLRGFSNTET 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRVVFPLELRLFNTS 

GDATNPDRMYDLVAWVHCGSGPNRGHYIAIV 

KSHDFWLLFDDDIVEBaDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 


3638 


A 


11 


630 


PAGIPVSTISSDRRASTDLTRKMKPDETPMFDPNL 

LKEVDWSQNTATFSPAISPTHPGEGLVLRPLCTA 

DLNRGFFKVLGQLTETGWSPEQFMKSFEHMKK 

SGDYYVTWEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDVWSDECRGKQLGNLLLSTLTLLSKKL 

NCYKITLECLPQNVGFYKKFGYTVSEENYMCRR 

ITTV


3639 


A 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPWLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE

CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF

CALPFMGFGKFVQYCPGTWCFIQMVHEEGSLSV 

LGYSVLYSSLMALLVLATVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVIYRAYYGAFKDVKE 

KNRTSEEAEDLRALRFLSVISIVDPWIFIIFRSPVFR 

IFFHKIFIRPLRYRSRCSNSTNMESSL 


3640 


A 


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 

AIEADCLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIEKIVEIDAinGCAjMSGLIADAKTLIDKARVETQ 

NHWFTYNETMTVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATNIELATVQPGQNFHMFTK 

EELEEVIKDI 


3641 


A 


2 


1254 


PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALKTVPT 

YERMIVFRLGRIRTPQGPGMVLLLPFDDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFRIWDP 

VLSVMTVKDLNTATRMTAQNAMTKALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

VEAVLQPPQDSPAGPNLDSTLQQLALHFLGGSM 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 

SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VLPbu 1 Qi>A Yr LDLTTORGRVGHGVPDGIPDVV 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 

LAMAMKLEAVLRALK 


3642 


A 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPEICPRKH 

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

KbKbQKAKQER 


3643 


A 


94 


541


RKERRRRRRRMEAWFVFSLLDCCALIFLSVYFn 

TLSDLECDYINARSCCSKLNKWVIPELIGHTIVTV 

LLLMSLHWFIFLLNLPVATWNIYRYIMVPSGNM 

GVFDPTEIHNRGQLKSHMKEAMIKLGFHLLCFF 

MYLYSMILALIND 


3644 


A 


95 


2808 


TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 
FCSRDPVPEYLNHCGVKYVLISDRASFCALfflFFS
PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK 
LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 
VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 
LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 
TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 
PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 
QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 
CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 
APAPEEEAEGPAAALGPRGPLGSGPGWLYLCPE 
ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 
GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

TQAKLGAHQRSHFEPERPYQCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IHLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD

TWKSRCPISSC]>m.FTSKHS]VIKTHMVKRHKVGQ

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSPFFGQEGETQFGFPNAA 

G>JHGSQKERNLITVTGSSFLV 


3645 


A 


2194 


1707 


TVSFHKTMASLKCSTWCVICLEKPKYRCPACRV 

PYCSVVCFRKPDCEQCNPETRPVEKmSALPTKT 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 


3646 


A


85 


1948 


ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKLKWNERATLFRITSNAMINACRDFLELAEIHS 

RKWQRALQYEQEQRVHLEETIEQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSFITVITEAKEDSRKAEGSTGTSSA 

DWSSADNVLDGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSIMKNCIGRELSRIPMPVNFNEPLSMLQ 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHRIAKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSKF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NIIVGKLWIDQSGDIEIVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGWSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 


3647 


A


46 


5007 


PTGDACVSTSCELASALSHLDASHLTENLPKAAS 

ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 

VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 

SPVTDIDSFIKELDASAARSPSSQTGDSGSQEGSA 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLDSINPDKHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQIEICSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPULSSPNMV 

NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 

KNCrcb VJUbNLHlbbsQDLDDLLQKPKMIARRPIM 

AWFKEINKHNQGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRKPLISPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQCVLESKPPLAT

SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLKIVEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSWPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQRIKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASPVKKNKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

EDVCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALWIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WNIMKSVPEGPVQLLIRKHRNSS 


3648 


A 


337 


1564 


KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAVPWVILSDGDGTVEKGFKAEWLAVKDER 

LYVGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPRRASQERYSEKDDERKGANLL 

LSASPDFGDIAVSHVGAWPTHGFSSFKFIPNTDD 

QHVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 


3649 


A 


1 


775


PTRPGSGSAGGARVGSQEFGVEMAALAPLPPLPA 

QFKSIQHHLRTAQEHDKRDPWAYYCRLYAMQ 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKNMIKSFYTASLLIDVITVFGELTDENVKHRKY 

ARWKATYIHNCLKNGETPQAGPVGIEEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 


3650 


A 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

\\rnTaDRMQPGPWGhnvCDKFVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTARTO^HYDTFLVmYVKRHLTlMMDIDGK 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELTVERTPEEEKLHRDVFLPSVDNMKL

PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGIILY
NKWQEQSRKRFY 


3651 


A 


1 


1218 


RSWAYVKKCKNNMCPNRGLHDGPEPCWLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQMQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIILILISFTCRFLLNSRVTDAAFNFLLVW 

YYCTLTIRESILINNGSRIKGWWVFHHYVSTFLSG 

VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLFLGNFFTTLRVVHHKFHSQ 

RHGSKKD 


3652 


A 


640 


164 


VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQPM 

MQTIGQKYCMDPAVIAGVLSRKSPGDKILVNMG 

DRTSMVQDPGSQAPTSWISESQVFQTTEVLtTRI 

TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 

QDLSCDFCNDVLARAKYLKRHGF 


3653 


A 


2 


909 


IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEWLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCUVSMNTADPGSQGJTHSLLLQVIDDKGSILPP 

NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEWK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3654 


A 


2 


909 


rVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRIKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFTVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 


3655 


A 


2


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDHNNPLFIMDGISPTDICQGILGDC 

WLLAAIGSLTTCPBCLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNVWDDRLPTKNDBCLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVR>fPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

Fn.LElCNLTPDTLSGDYKS Y WHTTF YEG S WRTG 

SSAGGCRNHPGTFWTNPQFKISLPEGDDPEDDAE 

GNVWCTCLVALMQKNWRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEEF 

TNSREVSSQLRLPPGEYniPSTFEPHRDADFLLRV 

FTEKHSESWELDEVNYAEQLQEEKVSEDDMDQ

DFLHLFKTVAGEGKEIGVYELQRLLNRMAIKFKS

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDXFRECDQDHSGTLNSYEMRLVEE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLKTMFTFFLTMDPKNTGHICLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 


3656 


A 


3 


174 


PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 


3657 


A 


1 


444 


DTRSTYHNAHSLPTYVKSPAPCQMTYIKSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCGENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GnPMKSRSPALL 


3658 


A 


92 


1537 


SEAPVQPQEYTMTSFYSTSSCPLGCTMAPGARNV 

FVSPIDVGCQPVABANAASMCLLANVAHANRVR 

VGSTPLGRPSLCLPPTSHTACPLPGTCHIPGNIGIC 

GAYGKNTLNGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKLLCSKAENARLIVQIDNAKLAADDFRIKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVOLRSQLGEKFRIEL 

DIEPTIDLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEILELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLENEIATYRNLTPLQSLFHACLLYFLSKLWPC 

HRWVSLWPWSQHGEMILKARVRRLRLVALGSG 

VPSPCPVFLQD 


3659 


A 


2 


402 


DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 


3660 


A 


26 


710 


CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKNGGLLLSTNMKWVQFSNLHVDVPKD 

LTKPWTISDEPDILYKRLSVLVKGHDKAYLDSY 

EYFAVLAAKELGISIKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFLDTIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 


3661 


A 


2 


370 


DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 


3662 


A 


205 


1277 


RKSLPHPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELNKKQERDWVSVVMQVMELESN 

SKRMESRLTDAESKYSEMNNQIDIMQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEWCDMETSGGGwTIIQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHIHRLSRQPTRLRVE 

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGNVGNDALQYHNNTAFSTKDKDNDNCLDKCA 

QLRKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHGSTYSLKRVEMKIRPEDFKP


3663 


A 


64 


1456 


LSSAKETI^QMYNTVWNMEDLDLEYAKTDINC 

GTDLMFiTEMDPPALPPKPPKPTTVANNGMNNN 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTKMHGDYTLTLRKGGNNKLIKIFHRDGKY 

GFSDPLTFSSWELINHYKNESLAQYNPKLDVKL 

LYPVSKYQQDQWKEDNIEAVGKKLHEYNTQFQ 

EKSREYDRLYEEYTRTSQEIQMKRTAIEAFNETIK 

DFEEQCQTQERYSKEYIEKFKREGNEKEIQRIMHN 

YDKLKSRISEIIDSRRRLEEDLKKQAAEYREIDKR 

MNSnCPDLIQLRKTRDQYLMWLTQKGVRQKKL 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSSh!RNKAENLLRGKRDGTFLVRESSKQGCYAC 

SVWDGEVKHCVINKTATGYGFAEPYNLYSSLK 

ELVLHYQHTSLVQHNDSLNVTLAYPVYAQQRR 


3664 


A 


944 


406 


GATVEDQSCNFGSLRWWSVPHISARSCPDPLLS 

RTGRVPGGRGAGLPRHHSPRCCLQVFFNGANVR 

QVDVPTLTGAFGBLAAHVPTLQVLRPGLVWHA 

EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 

MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 

lEANEALVKALE 


3665 


A 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLDLPYERFIKGEEDKPLPPIKPRKQENSSQE 

NENKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 

QEFSABCPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 


3666 


A 


113 


1492 


LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYPGI 

KARTTQRALDYGVQAGMKMIEQMLKEKKLPDL 

SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 

VPGVGKALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKNLNEMLCPIIASEVKALNANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPBRSNSMLYIGIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSRIAEIYILSQPFMVRIMATEPPII>fLQPGNFTLDI 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLVIL 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFENILSS 

ILHFGVLPLANAKLQQGFPLPMPHKFLFVNSDIEV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 


3667 


A


1 


181 


FRGRLGSGRNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKVPNACLFTINKEDHTLGNIIK 


3668


A 






A/ A /^T7 A\/DU'CT>A>nV/rVCm!>T VDCVT AT \n T\/VT?T T HT/^ 

V AuJiA VrrrrMM Y bEJrL.JSjo Y L.A1_ V L W Y rLL 1 (jt 

YCITKPEVIFKIEQGEEPWILEKGFPSQCHPAKYL 
WCLHD 


3669 


A 


458 


1056 


FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISCIRTVWRTEGLGAFYRSYTTQLT 
MNIPFQSIHFITYEFLQEQVNPHRTYNPQSHnSGG
LAGALAAAATTPLDVCKTLLNTQENVALSLANIS
GRLSGMANAFRTVYQLNGLAGYFKGIQARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 


3670 


A 


145 


298 


RNPCPLTFLPSTLMVLLLSLTFFSALTFHSICQLKN 
TGVEVDIVFQRVSFL 


3671 


A 


3 


462 


ILKVAKKERTMSSLPVPYKLPVSLSVGSCVIIKGT 

PIHSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 

NHVVMNRREFGIWMLEETTDYVPFEDGKQFELC 

lYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC 

KCPRRYLPWTSVCVCN 


3672 


A 


1 


1028 


HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLRTELGSFFTEYLQNQLLTKGM 

VILRDKIRFYEGQKLLDSLAETWDFFFSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNAITLSVKLED 

ALARAHARVPPATVQMLLVLQGVHESRGVTEDY 

LRLETXVQKWSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 


3673 


A 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREEKNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3674 


A 


2


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYnYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKTVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EIOBEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 


3675


A 


921 


1321 


VTLABCMRVfflSSCLKVQEQMANCPKFVPWPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 


3676 


A 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRREJRRRR 

RRMISRYTRKAVPQSLELKGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAIDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLRBLGRQnTPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSIIVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKLGYPPIAPFYCMKEDVLAYVFD 

SVWCKWSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYTTSNPMS 

LCQASRHQPNVNDLLVHGMPLQPKNLSLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRKNPPPRTLHPISTSHSCAETPRSVEEDL

RGARVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVE

HVSTVGPQRQMKPHGDSSRAQSAWDEPNYQQ 

PQERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P 


3677 


A 


246 


757


MRLQGAIFVLLPHLGPILVWLFrRDHMSGWCEG 

PRMLSWCPFYKVLLLVQTAIYSWGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGLALLHLLLLYGLWSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 


3678 


A 


20 


1508 


RGKAEFFLAMAGTNALLMLENFIDGKFLPCSSYI 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 

CTQMDHLGCMHYTVRAPVGVAGLISPWhfLPLY 

LLTWKIAPAMAAGNTVIAKPSELTSVTAWMLCK 

LLDKAGVPPGWNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERITQLSAPHCKKLSLELGGKNPAD 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 

SIYSEFLKRFVEATRKWKVGEPSDPLVSIGALISK 

AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDKDESCCMTEEIFGPVTCV 

VPFDSEEEVIERANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEIKTITVKH 


3679 



A 


1862 


502 


MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP 

TKKIPGPDRFTAKFYQRYKEELSNLIHYLGLSHH 

LLALNFDVSFGKKSAWSSAQVKVTDTDFDGVEV 

RVFEGPPKPEEPLKRSVVYIHGGGWALASAKIRY 

YDELCTAMAEELNAVIVSIEYRLVPKVYFPEQIH 

DVVRATKYFLKPEVLQKYMVDPGRICISGDSAG 

GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

LDFNTPSYQQNVNTPILPRYVMVKYWVDYFKG 

NYDFVQAMIVNNHTSLDVEEAAAVRARLNWTS 

LLPASFTKNYKPWQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 


3680 


A 


249 


2146 


RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVnVFHNEAWS 

TLLRTVYSVLHTTPAILLJCEIILVDDASTEEHLKE 

KLEQYVKQLQVVRWRQEERKGLITARLLGASV 

AQAEVLTFLDAHCECFHGWLEPLLARIAEDKTV 

V VbPiJlV IIDLN IrEFAlCrVQRGRVHSRGNFDWS 

LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSI 

SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEIIPCSWGHVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKJFYRRNLQAAKMAQEKSFG 

DISERLQLREQLHCHNFSWYLHNVYPEMFVPDL

TPTFYGAKNLGTNQCLDVGENNRGGKPLIMyS
CHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 


3681 


A 


2982 


1869 


LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEWQIRSEVSQVKREKENIQTLL 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKJEIQSQQLEALQQQVKQLQNQLAECKKQHQE 

VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 


3682 


A 


447 


1024 


AQALTAGRQLALAAPFIAPISPISLPRLNPPSQSW 

NSTPFFKVKLPPQKEVITSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSY1SIQI7VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 


3683 


A 


2 


942


LEIKQEEKFVGQCIKEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPPINHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKRPISNKYM YFMKNRARRQGD^KLLPNG 

FTKRKENSTFFDKKKQQFCWHVKLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPVIRQRLKAYI

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 


3684 


A


119 


1533 


SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAW 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

*GGDLTPVPDGPHDCPRDVQGEPGAGGGSQLAPC

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCRIGDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTP/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

YSLP*LCRECCSCP/PPPPAHGGRCPSLLPPEALAK 

LLL 


3685


A 


101 


438 


AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFNHTHICVTNLEYhaKEYPWDLV 
KAHLQGASTSNITFDIGELQKKuLDLNKQTQEFQ 
PSL*AWTEFQQGLE 


3686 


A 


105 


845 


VSDWKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWTLFLShfYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI 
EDLAREGDMPIHELLSLYGYGSTVRLPEEDEEEE
EEEEEGEDDEDADNDDNSGCSGENKEENIKDSS
GQEDETQSSNDDPSQSVASQDAQEIIRPRRCKYF 
DTNSEVEEESEEDEDYIP/SnSFFQSSDGPSSSSSE 
DWKKEIMVGS 


3687 


A 


49 


1225 


PVLVTSLRMREADTLRPPQLMEVSADnSTVEFN 

HTGELLATGDKGGRWIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEKINKIKWLPQQ 

NAAHSLLSTNDKHKLWKITERDKRPEGYNLKDE 

EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHINSISVNSDCETYMSADDLRINLWHLAI 

TDKSFTP\NIVDIKPA>ny[EDLTEVITASEFHPHHC 

NLFVYSSSKGSLRLCDMRAAALCDBCHSKLFEEPE 

DPSNRSFFSEnSVSVSDVKFSHSDRYMLTRXDYLT 

VKVWDLNMEARPIETYQVHDYLRSKLCSL YEND 

CIFDKJFECAWNGSDR/TIMTQAYNNFFRMFDRNT 

KRDVTLEASRGSSKPRAVL


3688


A 


1 


401 


KKWGRLSEMSFSLNFTLPANTTSSPVT\DCGPSL 

A A /^TTQT T \ f A T* ATTTTATT T?* I'l TTTr> T1T» C»OT1~» A n rr^T^ 

(jJLAAGlPLL VA 1 ALLVALLi' ILIHRRRSSIEAMEE 
oJLJiU'v^lllbtllJUiJJNJrJVlbbrlrKKar InJiKN IJMGAQh. 
AHIYVKTVAGSEEPVHDRYRPTIEMERRR 


3689 


A 


698 


889 


GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 


3690 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3691 


A 


61 


153 


MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 


3692 


A 


J 


2831 


PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 

DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD 

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 

KSVIIWSKLEGILKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 

QYLALGMFNGIISIRNKNGEEKVKIERPGGSLSPI 

WSICWNPSSRWESFWMNRENEDAEDVIVNRYIQ 

EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 

EERNDILAVADWGVQKVSFYQLSGKQIGKDRAL 

NFDPCCISYFTKGEYILLGGSDKQVSLFTKDGVR 

LGTVGEQNSWVWTGQAKPDSNYVVGGCQDGTI 

SFYQLIFSTVHGLYKDRYAYRDSMTDVrVQHLIT 

EQKVRKCKELVKKIAmCNRLAIQLPEKILIYELY 

SEDLSDMHYRVKEKIIKKFECNLLWCANHIILC 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGLKNGQILKIFVDNLFAIVLLKQATAV 

RCLDMSASRKKLAWDENDTCLVYDIDTKELLF 

QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

PVHRQKLQGFVVGYNGSKIFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKFHEAAKLYKRSGHENLALEMYTDLCMFE 

YAKDFLGSGDPKETKMLITKQADWARNIKEPKA 

AVtMYISAGEHVKAIEICGDHGWVDMLIDIARK 

LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG 

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDIA 

QDPAQKD 


3693 


A 


3 


1099 


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTfflG 
GWRHHRDHTAJODEWDFNPSKFLIYTCLLLFSVLL

PLRLDGnQWSYWAVFAPIWLWKLLWAGASVG

AGVWARNPRYRTEGEACVEFKAMLIAVGIHLLL 

LMFEVLVCDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEELCSVNILQFIFIALKLDRI 

mWPWLWFVPLWILMSFLCLWLYYlVWSLLFL 

RSLDWAEQRRTHVTMAISWmWPLLTFEVLL 

VHRLDGHhTITSYVSIFWLWLSLLTLMATTFRRK 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 


3694 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3695 


A 


483 


761 


PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVENIKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 


3696 


A 


456 


733 


LSAALWEEPILSLWSETKELTNRGKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
H\GGEGQGPGKRAGHLGRGGGMSFL 


3697 


A 


877 


1873 


VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT 
CRIVEHQSHPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHICLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADDSSNKEAAE 


3698 


A 


1 


572 


KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 

LRRDNPRF>JLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGNVWL\KSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 


3699 


A 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

PTPWRSRQSGKQSERAS*LKGRGRYGLGALGGR 

GGRALGGSRWPPPLPGETLFSGCBCHRRRRRGSD 

AAPGEEAGT 


3700 


A 


33 


1318 


GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYEFHHRCRLLEGVKQALWLTKTKL 

ffiGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARRICVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPLPTIASREEIEATKNHVLETFYPISPnDLHECN 

lYDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKMILFAFGSALAQARLLYGNDAKV 

LEQPWVQSVGTDGRVFHFLVFQLNTTDLDSNE 

GVKNLAWVDSDQLLYQHFWCLPVKKRVWEP 

VGPVGFKPETFRKFLALYLHGAA 


3701 


A 


86 


465 


WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
GAVQFLSFEMLTCLVHRGSVYDARE 


3702 


A 


166 


814 


GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEBFQEDTVRSPFLYNKDVNGK 
WLWKGDVALLNCTAFVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYRNVLQLA
KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR
RFLEffiGETEEKW 


3703 


A 


128 


1255


SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KJEVEASKXSYHAARKDEKTAQTRESHAKADSA 

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE

LHRYTPRYMEDMEQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKJ'HELHRDLHQGIEAASDE 

rJJLKW WRb I HGPOMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 


3704 


A 


1 


271 


ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 


3705 


A


170 


1318 


LNWANLVIMWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

WCRESRSHKQHSVLPLEEVVQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 


3706 


A 


204 


1996 


SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTJCSNRMLATEKPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGIPRPRGRLQRANTTVNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALDIKLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEGIAMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

MGTAVGSWPVTPDPATGKTTLGSVNNLTISDV 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 

ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 

PMSINSOT.LGYIGIDTIffiQMRKKTMKTGFDFNIM 

WGTEGCGAAAGLVAGSTKDPISFPQ 


3707


A


3 


549 


SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ 
DAEQPMRYETLFQALDKNGDGVVDIGELQEGLR 

LKDHEKKMKLAFKSLDBCNNDGKIEASErVQSLQ 
TLGLTISEQQAELILQSroVDGTMTVDWNEWRD 
YFLFNPVTDIEEIIR 


3708 


A 


1 


1866 


EFRGAGRANMLAPRGAAVLLLHLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND 

LWISTFKLQTKSSATEFGLYSSTDNSKYFEFTVM 

GRLSKAILRYIiCNDGKVHLVVFNNLQLADGRRH 

RILLRLS1>JLQRGAGSLELYLDCIQVDSVHNLPRA 

FAGPSQKPETIELRTFQRKPQDFLEELKLWRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTWPPASPAPPTRPPRRCDSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCINLSPGFRCDACPVGFTGPMVQ 

GVGISFAKShfKQVCTDEDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGDLNEQDNCVLIHNV 

DQRNSDKDffGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNDLDNCPKFPNRDQRDK 

DGEX3VGDACDSCPDVSNPNQ 


3709 


A 


144 


417 


TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYIY 


3710 


A 


245 


688


FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRNLSVADHSKTQVQKKENKSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCPICKE 

EFELRPQVFSIRG 


3711 


A 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 
TPAMMNGQGSTTSSSKNIAYNCCWDQCQACFNS 
SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA 
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKRRKLKNKRRRSLARPHDFFDAQTLDAJR 
HRAICnSfLSAHIESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNISXnSKSLKI 


3712 


A 


2 


344


RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRIEEACEMYTRAANIVIFKMAKNWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 


3713 


A 


20 


974 


GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL 

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHWSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVMNVTSPE 

FTSVQHGSRALATKDMRKSQERSMSYCDESRLS 

YLLRRJTRENDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVHDVLNER 


3714 


A 


237 


458 


IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 
QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 

JUlOrlKL. 


3715 


A 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAWLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 

PRSTALRSPGLSPLLH 


3716 


A 


85 


308 


QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 

VPLISPLDISQLQPPLPDQWKTQTEYQLSSPDQQ 

NYTKSR 


3717 


A 


58 


618 


GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEWD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG 

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 


3718 


A 


3 


593 


RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRMYDKYGSLG 

LYVAEQFGEENVNTYFVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 


3719 


A 


2 


2173 


SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RMYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSVVDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSWG 

SDFEPWTNKRGEILARYTTTEKLSINLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNREEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEASILKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLWQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGMKECDESGFPKHLLFRSLGLNLALAD 

PPESDRLQILNEAWKVITKLKNPQDYINCAEVWV 

EYTCKIBTXREVNTVLADVIKHMTPDRAFEDSY 

PQLQLIKKVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKCI\RTPLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMLSYLINGFIKMVSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETRKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 


3720 


A 


24 


296 


ENLFRAGFAFSLLRSSFYTSKTYCSWFSNLISGSL 

ADFNSKGTRDYSPRQMAVRE/KVFDVIIRCFKRH 

GAEVIDTPVFELKVRNGQEETTW 


3721 


A 


2 


310 


PSCLTCVGHCSIGGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 
K 


3722 


A 


75 


722 


MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 

VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 

YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 

EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 

tjJ<J-.ALbVOIUJs.iJLKl wNLvbOKoArlisJNJLIs.(jNA 

HIVEWSPRGEQYVVnQNKIDIYQLDTASISGTITN 

EKRISSVKFLSES 


3723 


A 


110 


316 


MELSDNRRSGGLEGLAEKCPNLTYLNLSGNKIK 
DLSTVEALVSGTVLSLDLLFLVKFSEICLCLLISI 


3724 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 
WO 01/57190 PCT/US01/04098



VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DG 


3725 


A 


3 


406 


VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 

DG 


3726 


A 


1 


433 


SSDDRSLFKRLKLNYAIFDEGHMLKNMGSIRYQ 

HLMTINANNRLLLTGTPVQNNLLELMSLLNFVM 

PHMFSSSTSEIRRMFSSKTKSADEQSIYEKERIAH 

AKQHKPFILRRVKEEVLKQLPPKKDRIELCAMSE 

KQEQLYLG 


3727 


A 


6 


383 


RIPRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AFNEKGQLRHIKTGEPFVFNYREHLHRWNQKRY 
EALGEHTKYVYELLEKDCNSKKYS 


3728 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEYLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKXIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKWTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 


3729 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAWRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPAVWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQITQAHLERLLQRVLR

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKWTANHRALQIPEVYLREAP

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK


3730 


A 


3 


2452 


EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD

RK 


3731 


A 


1 


1305 


VNTAMHEAKLMEECDELVEnQQRKQMIAVKIK 

ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 

ILJCENDQARFLQSAKNIAERVAMATASSQVLIPDI 

NFNDAFENFALDFSREBCKLLEGLDYLTAPNPPSm 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANFISLYNSVDSWMTVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSKNSEPTRLKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVFSRCNSNFWRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSLVH 

LHTTDVTFuLrVCPTFTiWNKSLMILSGL^ 

DYPERQECNCRPQESPYVSGMKTCH 


3732 


A 


127 


2832 


LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 

EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACLWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTONLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEANMDSGTETKKILILPWKLRA 

QKDVDSDRVKQEPRFEEEVnGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEAIFGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KIRDPFIHEISKIAMGMRSASQFTRDFIRDSGWS 

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

TFICQVCEETLAHSVDSLEQLTGNKGCFRHLTMT 

IDYHT\LIAN*YGPGFPLLF*PQAQCGETKFHVLK 

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNISNIIKSGKMSLroDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 


3733 


A 


2 


3274 


DVPLIRIEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAELPDEEFRLVKIRFLIEDIMDNAPLFPAT 

YINISIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LKSQNIFGLDVETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAELQVSVTDTNDN 

HPVFKETEDSVSIPENAPVGTSVTQLHATDADIGE 

NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD 

REETPNHKLLVLASDGGLMPARAMVLVNVTDV 

NDNVPSIDIRYrVNPVNDTVVLSENIPLNTKlALIT 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA\ADAGKPPLNQSAM 

LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDREKEDKYLFTILAKDNGVPPLTS 

NVTVFVSriDQNDNSPVFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 

TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGtWFQVIAVDNDTGMNAEVRYSrVGGNTRDL 

FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATLINELVPQICH 

LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 

VWIFITAVVRCRQAPHLKAAQKNMQNSEWATP 

NPENRQMMMKKJKKKKKKHSPKNLLLNVVTIEE 

TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

LK±LHilQbLPLDNTFVACDSISNCSSSSSDPYSVSD 

CGYPVTTFEVPVSVHTRPPVDLEVGGAQSGQVAI 

LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 

EQEAKILLVGYWGDGEWCHFHFHHLIPGPVNPG 

YERKQYHILDSDSEDTQPSGELCPIPVRPFTE.SIQ 

LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 


3734 


A 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF*GESLyEM 

QLITSLGLQEFDIARNVLELIYAQTLVWIGIFFCPL 

LPnQMINn.FIMFYSKNISLMMNFQPPSKAWRAS 

Q]vlMl>i>'lJ:''LLFFPSFTGVLCTLAITIWRLKP^ 

GPFRGLPLFIHSIYSWIDTLSTRPGyLWWWIYRN 

LIGS\OTFFILTLIVLnTYLYWQITEGRKIMIRLLH 

EQUNEGKDKMFLIEKLIKLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 


3735 


A 


2 


432 


VEVCRRYLWKMTVDASQNVQCCVIFSHFPFIFN 

NLSKDCLLHTDTLLKIESKKHKAYLRSAAIEEERE 

SEFALRPTFDLTVRKNHLIEDVLNQLSQFENEDL 

RKELWVSFSGEIGYDLGGSATCKEIFYCLFAEMIQ 

PEYGMFMY 


3736 


A 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKADKV 

TMLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAWQLPKNGPIYDWWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLILQ1*AD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

GYKIWHYTGSILHKYDVPSNAELWQVSWQPFLD 

GIFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKNLKKKLB^IEQLKEQAA 

TGKQLEKNQLEFaQKETALLQELEDLELGI 


3737 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 

EILPEPGSETPTYASEALAELLHGALLRRGPEMG 

YLPGPPLGPEGGEEETTTTllTTTTVTTTVTSPVLC 

NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGWLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH

SYSPITVESDFSNPLYEAGDTREYEVSI 


3738 


A 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTnri'riVTTTVTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGFNLTCR

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATESCLPGYALEPPGPPNAECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HRTASDAGFPVGSHVQYRCLPGYSLEGAAMLTC

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVTITCV 

PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 


3739 


A 


734 


445 


LLEPEPAEEYTEQSEVEST/EGMILPCCLYFAAFQ 
TNVSNFYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 


3740 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFELDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKVKKRIQLSPKXIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRILTIDG/* 

PQIAVTLNGVDKJLLFTTTSVINGSQVVTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL

PGYKGEPGRDGDK 


3741


A 


5048 


1236 


MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYS/U.yTVPT 

QNVTPNTVNQQPGAQQLYSRGPP/y>HIVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDM.IRNHTGSLAVANNNPTITVADSLSCPVM 

QN VQPPKSbP Vv STYLSGSSGSSSTRTPPTANHPV 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 

PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 

YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP 

APAS/y>APWPQPSKMAKPLAMAIQHFSLVIRML 

QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 

IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 

VPNLNADLKKLNCSPDSFRCTLTNIPQTQALLNK 

AKLPLGLLLHPFRDLTQLPVITSNTIVRCRSCRTYI 

NP\FVSFIDQRR*KCNLCYRVNDVPEEFMYNPLT 

RSYGEPHKRPEVQNS\TVEFIASSDYMLRPPQPAV 

YLFVLDVSHNAVEAGYLTI/LWCQSLLEXNLDKLP 

GVDSRT\RIGFMTFD\STYSFLQFTQEGLSQPQMLI 

VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 

MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 

RYLTRKIGFEAVMRIRCTKGLSMHTFHGNFFVRS 

TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQr 

ALLYTSSKGERRIRVHTLCLPWSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAW 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLYHLM 

KMIHPNLYRIDRLTDEGAVHVNDRTVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 

DVLGYTNFASEPQKMTHLPELDTLSSERARSFIT 

WLRDSRPLSPILfflVKDESPAKAEFFQHLIEDRTE 

AAFSYYEFLLHVQQQICK 


3742 


A 


934 


68 


SMLASQGVLLHPYGVPMTVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNGEPAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK 

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VEIIFNERGSKGFGFVTFENSADADRAREKVLHGT 

VV\EGRKI\EVN\NATARVMTNKKTVNPYTNGWK 

LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 


3743


A 


3 


1456 


QFQQAWMQNKVPIPAPNEVLNDRKEDIKLEEKK 

KTQAEBEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGS\KGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 

HMGPQGPPGPQGfflGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 


3744 


A 


1571 


652 


PLTGRKCPGWTHSGSRRSPRIAEEVPGFPKRAEA 

SRQFSETADRLELLRRAVMAAARATTPADGEEP 

APEAEALAAARERSSRFLSGLELVKQGAEARVFR 

GRFQGI^VIKHRFPKGYRHPALEARLGRRRTV 

QEARALLRCRRAGISAPWFFVDYASNCLYMEEI 

EGSVTVRDMFSPLWRLKKTPQGLSNLAKTIGQVL 

ARMHDEDLEHGDLTTSNMLLKPPLEQLNIVLIDF 

GLSFISALPEDKGVDLYVLEKAFLSTHPNTETVFE 

AFLKSYSTSSKKARPVLKKLDEVRLRGKKRSNfV 

G


3745 


A 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 
LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 
QDRGLWTDLKAESWLEHRSYCSAKARDRHFA 
GDVLGYVTPWNSHGYDVTKVFGSKFTQISPVWL 
QLKRRGREMFEVTGLHDVDQGWMRAVRKHAK 
-GL\P*CLGSCLRTGLTMISGAA^LDSEDEIEELSKT 
WQVAKNQHFDGFWEVWNQLLSQKRVGLIHM 
LTHLAEALHQARLLALLVIPPAITPGTDQLGMFT 
HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 
SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 
TSBCDAREPWGARYIQTLKDHRPRMVWDSQVSE 
HFFEYKKSRSGRHWFYPTLKSLQVRLELARELG 
VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 
PWSE 


3746 


A 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL 

YLIKKJU.VACAAVFYGFAVHMKIYPETYILPITL 

HLLPDRDNDKSLRQFRYTFQACL*ELLKRLCNRT 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYF 

YHLTREUDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSEFVTFN 

KVCTSQYFLWYLCLLPLVMPLVRMPWKRAWL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQIISHYKEEPLTERIKYD 


3747


A 


1 


2325 


MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKRKMTRAWCPDLKAVWKIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSN 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE 

PITHNKTLSKERERTYNKSGRWFYLDDSEEKVH 

NRDSIKNFQKSSWIKQTGIYAGKKLFKCNECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS

SFARHQRCHTGKKPYECIECGKAFIQNTSLIRHW

RYYHTGEKPFDCIDCGKAFSDFFLGLNQHRRIHTG

EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE

CDVCRKAFSHHASLTVQVHQRVHSGEBCPFKCKEC

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ

HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH

TGEKPYKOVIECGKAFGDNSSCTQHQRLHTGQRP

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC

GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFIQI

GHLNQHKRVHTGERSYNYKKSRKVFRQTAHLA

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS

LPSP 


3748


A 

A 




1 


vjUY 1 KbGYDSACKDFVPHDLEVQIPGRVFLVTG 

GNSGIGKATALEIAKRGGTVHLVCRDQAPAEDA

RGE]MRE\SGNQNIFLHRVDLSDPKKIWKFVENFKQ

EHKLHVL\VNNAGCMVNKJREAHKKMDFEKNFG

CQYSGVCTFLTTRPDPLCWRKNTDPRVIT\VSSG

GMLVQKLNNQ*SPVRKNTIWMGTMVYAQNKVS

ERQQVVLT\ERWGPRAPG\IHFSSMHPGWA\DTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 


3749 


A 


1939 


715 


GFLRLSQAT\RQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDIL\MSSVKGLAENEENKGFLR 

NWSGEHYRFVXSMWMAR-RSYLAAFANHGQSF 

TLSVSHACCGYSHHQIFWIVDLLQMLEMNMAIA 

FPAAPLLTVTLALVGMEAIMSEFFNDTTTAFYIILI 

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQYSSLALVTSWLFIQHSMIYFFHHYE 

LPAJLQHVRIQXEMLLQAPTLGPGTPTAVLPDDMN 

NNSGAPATAP\DSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAIIT 

DASFLSGLSASLLERKPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 


3750 


A 


2 


844 


GLLEPFSKLLSFVIQNAVFTLAYLVELCGLCYRA 

FTKERDKFYLSRSWLELLQALKLKSPLPDTNLL 

LLVQFICADAGTKLAESTILSKQMIASVPGCGTA 

AMECWQYINEVLDFM\ADMHTLTBCLKSHMKTC 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN 

KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 

SWLLLGSLTHNAVC/LKWPPLPGLPPLDAGSHV

ADHLIVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

WDSLLARQSGRW 


3751 


A


431 


2 


AFTRKCEETAFIVPQCEIIPTEAWCRRIPTGSSLER 
NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 
QLIAAKFGFAALGI/QTEVDIMSHAT* AVFEDPEKS 
RL\PQNCTPVDMKffiFGVHVTSKEILTDVIDNDS* 
RHSPS 


3752 


A 


131 


1278 


AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS 

LGSARTLRGWSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQWKDHV 

TKPTAMAQGRVAHLffiWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

WSSVDGEDSTDDSYDEDFAGGMDTDMAGQLPL 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 

DTLCSSLCSLEDGLLGSPARLA\PSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GWSLDEDEAEPEEQ 


3753 


A 


3 


1138 


YYSSVRQRVTCEEPRFRECAAALIEGSATEVYAG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

YENGLTPSEGSPELPSSPASSRQPWRPPACRSPLP 

PCjCjUCjOPI' sSPKA WPEbWGGAG AQAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLWGAVAL 

LDLSLAFLFSQLLT 


3754 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKWKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHE.ASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQNEAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKiEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 


3755 


A 


2 


3338 


SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKWKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

WKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALWLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHBLASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 










EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGWAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLDEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS


3756 


A 


112 


1361 


SLEEQQGRHPSFAPKCASQELGRIMITLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TRHRGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGP\DSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 


3757 


A 


413 


1 


PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKn 
QVIETIESNWQCG*HTLQRIQCHSEKSKGVYCLQ 
YDDEK 


3758 


A 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPHFLGEPEDVYIVKNKPVLLVCKAVPATQIFF 

KCNGEWVRQVDHVIERSTDGSSGLPTMEVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 

PPAE 


3759


A 


1 


561 


ADDTLHLWNLRQKRPAILHSLKFCRERVTFCHLP 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMWN 

KAIELSSKSHPGPWfflSDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

FICSHSDGTLTIWNVRSPAKPVQTITPHGKQLKD 

GKKPEPCKPILKVEFXTTR 


3760 


A 


1 


824 


LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRWLLYDEHGERRDKFSTKPADMKYGRKS 

YMVKGMAFSPDSTKIAIGQTDNIIYVYKIGEDWG 

DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYIIVFGLAEGKVRLS 

NTKTNKSSTIYGTESYWSLTTNCSGKGILSGHA 


3761 


A 


2253 


320 


PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSRKVF 

QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH 

GSPQMVRRDIGLSVTHRFSIKSWLSQVCHVCQK 

SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 

RLRRTESVPSDINNPVDRAAEPHFGTLPKALTKK 



WO 01/57190

PCT/US01/04098













EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPP\NPSP\GQR\DSRFNFPSC/AYFIHHR\Q\QFI 

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAJRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENWLFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

HAKGIVHKDLKSKNfVFYDNG\KWITDFGLF\GIS 

GWP\EGRRENQLKLSHDWLCYLAPEIVREMTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSVFSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKVVPRFERFGLGVLESSNPK 

M 


3762 


A 


2 


1578 


MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGWQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAVVLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNKKVKKRIQLSPKKKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRILTIDG/* 

PQIAVTLNGVDKILLFTTTSVINGSQWTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 

PGYKGEPGRDGDK 


3763 


A 


3 


1267 


CKVWR>JPLNLFRGAEYNRYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAMQKAWRE 

RNPQARISAAHEALEINECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQ\TRHQCLGVHQKKASNVCQKTRE 

DQGSSENDERFNEGVPPSEYVQYP*KPF\KALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPEHDLKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 


3764 


A 


25 


1032 


RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 
KKNIKKKKYVNSGTVTLLSFAVESECTFLDYIKG 
GTQINFTVAIDFTASNGNPSQSTSLHYMSPYQLN
AYALALTAVGEnQHYDSDKMFPALGFGAKLPPD 
GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

VV^L.iLirirNrArV V IJtlVAKJNAAAVl^UOol^YbVL 

LnTDGVISDMAQTKEAIVNG\SKLPMSIirVGVGQ 
AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR 
DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 
KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 


3765 


A 


172 


3456 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG 










KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK 

PQKRKSLESSYBaDSLLLANSKKTRNYIAJDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDCILSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLBCLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFmVEPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQVVNTNMQSVQLNTEDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA 

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSHIPPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTDEFDEFFSSSALNALANDTLDLPHFDE 

YLFENY 


3766 


A 


3 


1622 


AQQIVYKNVMLENYKNLVSLGYQLTKPDVILRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^SSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3767 


A 


3 


1622 


AQQIVYKNVMLENYKNLVSLGYQLTKPDVILRL

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 










ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSF/VHSSRLIRHQR 

THTGEKPYECPECGKSFRQSTHLILHQRTHVRVR 

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 


3768 


A 


185 


2258 


SIIIKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLffiRKELLHNlQLLKIELS 

QKTMMIDNLKVDYLTKffiELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSDPEYVSVRFYELVNPLRKEICELQV 

KJKNILAEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEILEASHMIQTKERSELSK 

EWTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY 

ERENRNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKRTTELQAQNSEHQARLDIYEKLEK 

ELDEHMQIAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 


3769 


A 


3 


2297 


DAAEFRVVADAMKVIGFKPEEIQTVYKBLAAILH 

LGNLKFWDGDTPLIENGKVVSIIAELLSTKTDM 

VEKALLYRTVATGRDIIDKQHTEQEASYGRDAF 

AKAIYERLFCWWTRIMDIffiVKNYDTTIHGKNTV 

IGVLDIYGFEFDNNSFEQFCINYCNEKLQQLFIQL 

VLKQEQEEYQREGDPWKHIDYFNNQIIVDLVEQQ 

HKGIIAILDDACMNVGKVTDEMFLEALNSKLGK 

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFIDKNKDTLFQDFKRLMYNSSNPVLKNMWP 

EGKLSITEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCKPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDLPSDKEAVKKLffiRCGFQDDVAYGKTKIFIRT 

PRTLFTLEELRAQMLIRIVLFLQKVWRGTLARMR 

YKRTKAALTIIRYYRRYKVKSYIHEYARRFHGVK 

TMRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 

ASQLKSIPASDLPQVRAKVAAVEMLKGQRADL

GLQRAWEGNYLASKPDTPQTSGTFVPVANELKR 

KDKYMNVLFSCHVRKVNRFSKVEDRAIFVTDRH 

L Y J<JVLUr 1 KV^ I K V ftlN. 1 Ir J-. Y N L 1 0 Lb V o^l Ol^^ 

YVFHTKDNKDLIVCLFSKQPTHESRIGELWGVLV 

NHFKSEKRHLQWNVTNPVQCSLHGKKCTVSVE 

TRLNQPQPDFTKNRSGFILSVPGN 


3770 


A 


3 


6276 


HKVAAPDVWPTLDTVRHEALLYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEWGLNFSS 













ATTPELLLKTFDHYCEYRRTPNGWLAPVQLGK 

WLVLFCDEINLPDMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

HRFLRHVPVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRIWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLRIDRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCKNEKIAFIM 

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVre. 

NLHWFTMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 

PWYDKLPQPPSHREAIVNSCVFVHQTLHQANA 

RLAKJIGGRTMAITPRHYLDFINHYANLFHEKRSE 

LEEQQMHLNVGLRKIKETVDQVEELRRDLRIKS 

QELEVKNAAANDKLKKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWKQIRSnMRENFIPTIVNFSAEEIS 

DAIREKMKKNYMSNPSYNYEIVNRASLACGPMV 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

AKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR 

QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLIIDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDBDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

XnTTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSnnCDLFQVA 

FNRVARGMLHQDHITFAMLLARIKLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

WRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ 

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

NLLRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 

SPNBRARLYFLLAWFHAHQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

FTTRSFDSEFKLACKvnOGHKDIQMPDGIRREEFV 

QWVELLPDTQTPSWLGLP2WAERVLLTTQGVD 

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS 

DGRPVAWMRTLHTTASNWLHLDPQTLSHLKRTVE 

NIKDPLFRFFE\REVKMGAKLLQ\DVRQDLADV\V 

QVCEGKKKQTNYLRTLI\NELV\KGILP\RSWSHY 










1 VrA(j\Ml Vlv^Ww VrloAKKl\ls.V^Ll^NloL\AAASCj 

GAKELBMHVCLGGLFVPEAYTTATRQYVAQAN 

SWSLEELCLEVNVTTSQGATLDACSFGVTGLKL 

QGATCNNNKLSLSNAISTALPLTQLRWVKQTNT 

EKKA.SWTLPVYLNFTRADLIFTVDFEIATKEDPR 

SFYERGVAVLCTE 


3771 


A 


1 


2043 


LPLLHAGFNRRFMENSSHACYNELIQIEHGEVRS 

QFKLRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHFVSLKKLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA 

PSLQNRRYPSMARIHSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEBLRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITINDVPPCISQLLDNEESWDFNIFELEAITH 

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTflAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFL\C 

NAGSELAVLYND'RAV\LESHHTALAFQ\LTVKDT 

K\a«in^KNro/RGNHYRTLRQAiroMVLATEMTKH 

r bHVNKr VNblNJvPMAAElEOSDCECNPAGKNFP 

ENQILIKRMMIKCADVANPCRPLDLCIEWAGRIS 

EEYFAQTDEEKRQGLPWMPVFDRNTCSIPKSQI 

SFIDYFITDMFDAWDAFAHLPALMQHLADNYKH 

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 


3772


A 


1013 


50 


TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 
HELIKEAEIIQGIMALLTRTLEEASEQIRMNRSAK 
YNLEKDLKDKFVALTIDDICFSLNNNSPNIRYSEN 
AVRIEPNSVSLEDWLDFSSTNVEKADKQRNNSL 

JVLLJsAL V UxKJJjOl^ 1 AJN y LKJs.t^ 

KDTKDARDQLADHLAKWMEEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ 

YRLMKEVQEITHNVARLKEILAVQAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC 


3773


A 


1


955 


AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 
QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 
LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 
DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

oClVLLLuOrKK 1 lJJlJrljiiljYL,\L5>rl(^KlCKYrL.LLKE 

LAKRTPGKHPDHPAVQ\SALQAMKTVCSNINETK 

RQMEKLEALEAAA/QSHIEGWEGSNLTDICTQLL 

LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 

TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE 

DGTGSPSPSLA 


3774 


A 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLI 
RVDGKGSIKELFPTGKQLEPLVAPLADGKVAVG 

ODDT TVVT KrFFfrTPTnKTAT ■MWmrPVAMFHnp 

PYUAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG 

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHIKl^YAFNLFC 

QKRFDESMQVFAKLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS 










QLVKKLNDSDHQSSTSPLMEGTPTIKSKKKLLQn 

DTTLLKCYLHTNVALVAPLLRLENNHCHIEESEH 

VLKKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFIYVHILIGjTOMAEEYCHKHYDRN 

KDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPK 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN 

DIRIFLEKVLEENAQKKRFNQVLKNLLHAEFLRV\ 

QEERILHQQVKCnTEEKVCMVCKKKIGNSAFAR 

YPNGWVHYFCS\KEVNPADT 


3775 


A 


1832 


839 


MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTWMSRARQQTFIFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

VYLGRPSLDHPIEATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL 

PPDTLLQQVTLSHGGPENPQNWNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 


3776 


A 


3 



796 


PRAKLGTRAKNMAGQDAGCGRGGDDYSEDEGD 

SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 

LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 

PTEVKIQEMTKLGHELMLCAPDDQELLKGCACA 

QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 

REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 

PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 

EbAAKLHALRTEYFAQHEQGAAAGAANTSAP 


3777 


A 


3 


413 


SEEDVIEGKTAVIEKRRKKRSSAGVVED/IGGEVQ 

NMLEGVGVDINKALLAKRKRLEMYTKASLRTSN 

QKIEHVWKTQQDQRQKLNQEYSQQFLTLFQQW 

DLDMQKAEEQEEKILVGIMIRFIINQVSSRNGQPS 

LLL 


3778 


A 


132 


788 


SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTHQPIVRPPAAARTMWVNPEEVLLANALWITE 

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACGVGSR 

KEITEHWEWLEQhfLLQTLSIFENENDITTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 


3779 


A 


2 


934 


CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

V JLfl^O W LLJI\.Vji-i V IN IVJvVO/VlN I oivi V V^o AIN oxl V KJxJL 

MNEKATHEQEMEEAKENFKNALTGILTQFEQIV 
AVFNASTRQKAWDHFSKAQRKNIDIWAK\HSEE 
LKNAQSEQLMGIRREEEMEMSDDENCDSPTKKM 
RVDESALGAP 


3780 


A 


1 


2535 


AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS 










HRAGSRDCLPPAACFRRRRLARRPGYMRSSTGP 

GIGFLSPAVCnLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEffiALQARMFVLEAKDQQLRRE 

lEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

EITTKVCMSEKFCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKSNTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK 

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 


3781 


A 


3 


995 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3782 


A 


1 


2649 


FRVPDSCPWLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFLCTYRAFTTTQQVLDLLFKRYGRCi)ALTA 

SSRYGCELPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK 

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPWAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKWP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARWEHWIEVAREC 

T?TT V~Kn7QQT VATT QAT r%CXTOTTJT>T T/'V'TOrCFW/CtJTVO 

FRIFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPKRAQKRPKETGHQGTVPYLGTFLTDLVML 

DTAMKDYLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS 










TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKKFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV 

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPENANVFYAMNSTANYDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 


3783 


A 


3 


869 


RSGQGKVYGLIGKRRFQQMDVLEGLNLLITISGK 

RNKLRVYYLSWLRNKILHNDPEVEKKQGWTTV 

GDMEGCGHYRWKYERIKFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT 

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK 

DVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRS 

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRNCIMNW 


3784 


A 


1213 


457 


LSPRQVDGLAGLQKGLSLSLLYQFLMNGERLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG 

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQLCTFSSTKDLLSQWEIFPPQSWKIALVAAMM 

SGIAWLAMAPFDVACTRLYNQPHRCTGQGP\LY 

RGJLDALLQTARTEGBFGMYKGIGASYFRLGPHTI 

LSLFFWDQLRSLYYTDTK 


3785 


A 


193 


813 


RRRGRHSLCGGKMLAYCVQDATWDVEKRRNP 

SKHYVYIINVTWSDSTSQTIYRRY\SKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGKILFRRSHIRDV 

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR 

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHWNCVTQKCLFVFHFKFSSSGNKE 

SKSL 


3786 


A 


3785 


1632 


EFVGRAASTTVVTRIAWRMADAGIRRWPSDLY 

PLVLGFLRDNQLSEVANBCFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKAAVPAKRVGL 

PPGJCAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQBCPKimVTVKAQTKAPPKPARAVAPKIANGK 

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAKAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAWSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

KQNbAAJCEAETPQAKKIKLQTPNlFPKRKKGEK 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 

SISVQVNSKFDSE 


3787 


A 


3 


5078 


IPEG/RALSAEHTSSLVPSLHITTLGQEQAILSGAV 
PASPSTGTADFPSILTFLQPTENHASPSPVPEMPTL 










PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KKDSVTAILGKNEEANVTIPLQAFPRKEVLSLHT 

VNGFVSDFSTGSVSSPnTAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESnSGLQQQTNYDLNGHTISTTS 

WETHLAPTAPPNGLTSAADADCSQDFKDTAGHS 

VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTIAK^nmjKAASGPKRTPGAVHTAF 

PFITTYMYARTGHTTSTHTA/IARKHGHCLWPVV 

YNLP/PP/GKPQAMHTGLPNPTNLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAWIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHHWLLQAD 

PWKNPPNNLWIIAAVLAPIAWTVmilTAVLCR 

KNKNDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVNKALKQKSDIEHYRNKL 

RLKAKllKGYYDFPAVETSKGLTERKKMYEKAP

KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 

EPEDEETNIDRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEWTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

LHDSASFTQMSRGPVSVTQLDQSALNYSGNTVP 

AVFAffAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYIEAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLTOISTAALVKAIREEVAKLAKKQTDMFEF 

QV 


3788 


A 


2 


1737 


MKGLYTDAEMKSDNVKDKDAKISFLQKAIDW 

VMVSGEPLLAKPARIVAGHEPERTNELLQnGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKNVREEESRVHKNTEDRGDAEDCERSTSRD 

RKQKEELKEDRMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK 










KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKKLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNVITESHNSDNEEDDQFWEA 

APQLSEMSEIEMVTAVELEEEEKHGGLVXKILET 

KKDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILBCMEEKJQ 

KMVYSINLTSRR 


3789 


A 


1 


4369 


MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDFNWEQVNTLTKPTSDPWMPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVrrSGHQG 

YLAroEVKVLGHPCTRTPHELRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGroVRDAPLKEIKVT 

SSRRFIASFNWNTTKRDAGKYRCMIXRTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLWIQLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEWEVKSRQITIRWEPFGY 

NVTRCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYTNVSVKLE.MNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTKISAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 

DNKTYNGYWNTPLLPYKSYRIYFQAASRANGET 

KIDCVQVAIXGAATPKPVPEPEKQTDHTVKIAG 

VIAGILLFVIIFLGVVLVMKKRKLVAKKRKETMSS 

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

THTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMKNRYGNIIAYDHSRVRLQT 

lEGDTNSDYENGNYnDGYHRPNHYIATQGPMQET 

IYDFWRMVWHENTASIIMVTNLVEVGRVKCCK 

YWPDDTEIYKDIKVTLIETELLAEYVIRTPAVEKR 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLWHCSAGAGRTGCFIVIDIML 

DMAEREGWDIYNCVRELRSRRVNMVQTEEQY 

VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQDCEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDILPPDRCLPFLITIDGESSNYINAALM 

DbYK.QPbAJUVTQHPLPN I VKIJFwRLVLDYHCTS 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDnSRJFRTYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLIRQVDKWQEEYNGG 

EGRTWHCLNGGGRSGTFCAISrVCEMLRHQRTV 

DVFHAVKTLKNNKPNMVDLLDQYKFCYEVALE 










YLNSG 


3790 


A 


261 


485 


EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 


3791 


A 


1 


5874 


LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHNLTTDLLNHLVFVQ 

KVFMKEVNEVIQKVSGGEQPIPLWNEHDGTADG 

DKPKILLYSLNLQFKGIQVTATTPSMRAVRFETG 

LIELELSNRLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVLITLNRPIVYAQPVAFDRAVL 

FWLNYK\AAYDNWNEQRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKOTCIRFADGFETSWDDWKPEIHGDLVMNA 

CVVPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGmVHlvnDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKRIVDIRELNEQAKV 

IDDLKKLGASEGTINQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKKKKFQTNYASTTHLMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKQSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEES\TTSLVS\SSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSSNRGELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGEPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLI 

MSAVCDIGSASFKYDMRRLSEILAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQQGTPWETLWFAINLKQL 

NVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAmSEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLKWDIFQVMISRSTTPDLIKIGMKLQEFFT 

QQFDTSKRALSTWGPVPYLPPKTMTSNLEKSSQE 

QLLDAAHHRHWPGVLKWSGCmSLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNIAFWTEAQKIWEDGSSDHSTYTVQTLDF 

HLGHNTMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFNYVTATRNEELNLLRNVDAN>fT 






WO 01/57190



PCT/US01/04098













ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKPKVECSWTEF 

TDHICVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RILSTRPGQKSPIIIHDDNSSDKDREDSITYTTVDW 

RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ 

KLGFHHARTTIPKWLQRGVMDPLDKVLSVLIKK 

LGTALQDEKEKKGKDKEEH 


3792 


A 


1 


364 


QNGSTPLHHAASKNRHEIALNILLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHILLyYKASTnQDT 
EGNTPPHLVCDnRVEEAKLLVSQGA/SIYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 


3793 


A 


2 


340 


DIVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 
DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 
PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 
KSGPASRPAL 


3794 


A 


421 


158 


SYWVGEDYTYKFFEVILIDPFHKAIRKNPDTQWI 
SKAVYKHREMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 


3795 


A 


24 


592 


GGMDSRVSGTTSNGETKPVYPVMEKKEEDGTLE 

RGHWNNKMEFVLSVAGEUGLGNVWRFPYLCYK 

NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEGIGYASQMIVILLNVYYnVLA 

WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 


3796 


A 


3 


592 


KPASTYSTSQPSMAPLLPIRTLPLILILLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFVVPPCRGRRELVSWDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSREIPMFTLPRRNMESIGLGMARTGGMWI 

TVLLSVAMFLLVLGFIIALALGSRK 


3797 


A 


1 


1556 


ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

PLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVGILINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTWALLADWLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GPTPIPELTHIMVGLESCSFLEEDFIPFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHWMYN 

ATSYHHSYEDTGLLCIHASADPRQVREMVEnTK 

EFILMGGTVDTVELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASICMLRGKPAVAALGDLTDLPTYEfflQTAL 

SSKDGRLPRTYRLFR 


3798 


A 


73 


759 


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 

QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 

LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3799 


A 


73


759 


KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 










LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 


3800 


A 


250 


1032 


GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 

TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

ADVAVFEAVSSPPPADLCHALRWYNHIKSYEKE 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 

DDDE)LFGSDDEEESEEAKRLREERLAQYESKKA 

KKPALVAKSSELLDVKPWDDETDMAKLEECVRS 

IQADGLVWGSSBGLVPVGYGIKKLQIQCVVEDDK 

VGTDMLEEQITAFEDYVQSMDVAAFNKI 


3801 


A 


155


656 


SREMELVTFRDVAIEFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

LKIHETAAKPPAICSPFSQDLSPVQGEDSFHKLE

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKEFQCNTCVRVFSTSSHSNKHK 


3802 


A 


1 


1428 


VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRKWEVGKDL 

PAMLRLLRSLVFPGGKKPGTITECGEDIRSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRIHTGERPYECSECGKLFRQNSS 

LVDHQKIHTGARPYECSQCGKSFSQKATLVKHQ 

RVHTGERPYKCGECGNSFSQSAILNQHRRIHTGA 

KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 

GDCGKSFSQSSILIQHRRIHTGARPYECGQCGKSF 

SQKSGLIQHQVVHTGERPYECNKCGNSFSQCSSL 

IHHQKCHnT 


3803 


A 


193 


617 


LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 

QHLKRLKKSGLGHLKWTKAEDIDIETPGSILVNT 

NLRALINKHTFASLPQHFQQYLLLLIJPEVDRQMG 

SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 

VFSIM 


3804 


A 


197 


479 


SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 
1 GbobbrM bb W V abPLyFbOLbOoSRMKvjub A 1 KJ 
LLETLLLAAHMTADQGIASSQRCLL 


3805 


A 


1 


385 


QSADTLFPGDINFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPRPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 


3806


A 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGhJLLQ 

PQAPGHDMTSBPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SSN* 

KRIANGLGFSrVQMEKiiSCSHLKSDLVRiKjlLFF 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKAIREGTMGAKTERDLGPVP 


3807


A 


656 


1238 


RCPSLLPPSWPLPTLQTLTRTPGNKAIAGGAGLW 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDWRLWGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 

EDLARMYRDDITVTWYFNSESIGAYYKGG 


3808 


A 


26 


2195 


SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 

MLRFKKTBFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

KIFRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFimSPDFEAAKFWVGNMGKTATHAWF AKL 

CVPGDQCHGLHPFIVQIRDPKTLLPMPGVMVGDI 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFKDVRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAAYKWLVCYLLRETYQKLNQEKRSGSSDFEAR 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDVIAP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 


3809 


A 


117 


830 


CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTMAS 

SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

APYCTRSQRVSENTMLPFVSNRTTFFTRYTPDDW 

YTR5NLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGERVNDIGFWKSEIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

HREKRMGIDLVHDEVEAQLLTVNVGEMHQSQA 

A


3810


A


3 


518 


VIQELEGGSGADLGEHSCRPASQPRFPRPAEARS 
HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 
FSEQELKQWYKGFLKDCPSGILNLEEFQQLYIKF 
FPYGDASKFAQHAFRTFDKNGDGTIDFREFICAL 
SVTSRGSFEQKLNWAFEMYDLDGDGRITRLEML 
EIIE 


3811 


A 


81 


1147 


GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 

RADATNVNNWHWTERDASNWSTDKLKTLFLAV 

QVQNEEGKCEVTEVSKLDGEASINNRKGKLIFFY 

EWSVKLNWTGTSKSGVQYKGHVEffNLSDENSV 

DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 

MGIYISTLKTEFTQGMILPTMNGESVDPVGQPAL 

KTEERKAKPAPSKTQARPVGVKIPTCKITLKETFL 

TQPT7PT VP \7T7TTr^1-T Vr» A 17TTJ A T> A TT V ATW nr^JiT'T: 

HMVDGNVSGEFTDLVPEKHIVMKWRFKSWPEG 
HFATITLTFIDKNGETELCMEGRGIPAPEEERTRQ 
GWQRYYFEGIKQTFGYGARLF 


3812 


A 


20 


558 


PCGTAASTHAYDRRAKCRQQQQQQQNGGQNKV 
RPAKKKTSPAREVSSESGTSGQFIPPSSTSVPTIAS 










SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 


3813 


A 


1 


1016 


CTEPPRRSTRTPAALASLRPYTDYVWSDQILQES 

EDFFTLmSHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPP

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 


3814 


A 


2 


884 


VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CCIPHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIWQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKP 

VANHQYNIEYERKKKEDGKRARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPWYLKEILKEIG 

VQhfVKGIHKNTWELKPEYRHYQGEEKSD 


3815 


A 


17 


411 


NIGDWEDIGKSPERIIQYYGPATWAQDGSRGYCT 
PIYMLNHIIRLQAVLEIIMNERANALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV*GJCFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 


3816 


A 


3 


1172 


SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIWFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARILQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

EESEFREGVVVDRPTRPGHGSFVNCGMKKEVKI 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

TIRTEEAILISLAALQPGLIQAGARHT 


3817 


A 


246 


1197 


FLSAGMSNFTHYAYLLMffiSLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTIPPGDSGAGSNLI 

TMFLRNRKETDLCSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTIKHAVALESRHQKGELQC 

LIKMCIPLSKPLQMFFSPPHWEAWLQRVQQLAK 

NTRYFRQRLQEMGFnYGNENASWPLLLYMPG 

KVAAFARHMLEKKIGWWGFPATPLAEARARF 

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELED 


3818


A 


215 


789 


NPQSSSSEGSSEIFQVNGHNRLLVQRSEVTQAPG 

QYTVDVEGHGCTFIQATLKYNVLLPKKASGFSLS 

LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV 

KMLSGFTPTMSSffiELENKGQVMKTEVKNDHVL 

FYLENVFGRADSFTFSVEQSNLVFNIQPAPGMVY 

DYYEKEEYALAFYHINSSSVSE 


3819 


A 


1 


1483 


MPDSnSRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RRHRMSPFPLTSMDKAFITVLEMTPVLGTEIINYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFnGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAWQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPWAVMSTGNELLNPED 

DLLPGKIRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLNALNEGISRADVIITSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLPTTFATLDIDGVR 

KIIFALPGNPVSAWTCNLFWPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEWDVMVIGRL 


3820 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVVYL

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 


3821 


A 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGILCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET

CHNIQGSFRCLRPECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVVYL

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI

FFTTFAL 


3822 


A 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALWKALKAFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP 










oxlUlK I isJvUL.is.!5 W V y UNi^ 1 AUOKoLr JJ" DEMDK 

MPPGLMEVLRPFLGSSWVVYGTNYRKAIFIFISN 

TGGEQINQVALEAWRSRRDREEILLQELEPVISR 

AVLDNPHHGFSNSGIMEERLLDAWPFLPLQRHH 

VRHCVLNELAQLGLEPRDEWQAVLDSTTFFPE 

DEQLFSSNGCKTVASRIAFFL 


3823 


A 


1 


3174 


YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC

SLPSSRNDSIPQEDFTPEVYRVFLNNLCPRPEIDNI

FSEFGAKSKPYLTVDQMMDFINLKQRDPRLNEIL

YPPLKQEQVQVLIEKYEPNNSLARKGQISVDGFM

RYLSGEENGWSPEKLDLNEDMSQPLSHYFINSS

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGVPLPSPMDLMYKILVKN 

KKKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLVNYIQPVKFESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYRLKPEFMRRPDKHFDP 

FTEGJVDGRVANTLSVKIISGQFLSDKKVGTYVEV

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI

VFKKWLPTLACLRIAVYEEGGKFIGHRILPVQAI

RPGYHYICLRNERNQPLTLPAVFVYIEVKDYVPD

TYADVIEALSNPIRYVNLMEQRAKQLAALTLEDE

EEVKKEADPGETPSEAPSEARTTPAENGVNHTTT

LTPKPPSQALHSQPAPGSVKAPAKTEDLIQSVLTE

VEAQTIEELKQQKSFVKLQKKHYKEMKDLVKR

HHKKTTDLKEHTTKYNEIQNDYLRRRAALEKS

AKKDSKKKSEPSSPDHGSSTffiQDLAALDAEMTQ

KLIDLKDKQQQQLLNLRQEQYYSEKYQKREHIK

LLIQKLTDVAEECQNNQLKXLKEICEBCEKKELKK

KMDKKRQEKITEAKSKDKSQMEEEKTEMIRSYI

QEWQYIKRLEEAQSKRQEKLVEKHKEIRQQILD

EKPKLQVELEQEYQDKFKRLPLEILEFVQEAMKG

KISEDSNHGSAPLSLSSDPGKVNHKTPSSEELGGD


3824 


A 


1 


426 


ILHWFVHRWSGRNNREKIGVHVGFEEILNMEPY
CCRETLKSLRPECFIYDLSAVVMHHGKGFGSGH
YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA

r\A VTT PVTT^D\/T'l?XTniJO'L^T T T T O O i^tm'KTCTN 

liL/Jr I lyKV IJllNUnaJiJLLrrbLLLtiotjxirlNtD 

ADTSSNEILS


3825 


A 


3 


364 


GIRAKFPNKIPVWERYPRETFLPPLDKTKFLVPQ
ELTMTQFLSIIRSRMVLRATEAFYLLVNNKSLVS
MSATMAEIYRDYKDEDGFVYMTYASQETFGCLE
SAAPRDGSSLEDRPLHPL


3826 


A 


1 


1237 


PEKKFERECREAEKAQQSYERLDNDTNATKADV
EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ

DSERKVEPIISKCLEGMILAAKSVDERRDSQMW

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ

ESGKMDAKTTVGKAKGKLWLFGKKPKGPALED

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL

NKMKDVYEKNPQMGDPGSLQPKLAETMNNIDR










LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDPLPAIGHCKAIYPFDGHNEGTLAMK 
EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 


3827 


A 


2 


1584 


ENPVSSAVKGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLIjyKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAEIVYAATTSLRQA 

NQEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQFDTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CLSELNTWHGRPEEKWNPQTSSFLQVLVSVQSLI 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIRNPSPCFKEVIHKHFYLKRVEIM 

AQCEEWIADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL 


3828 


A 


1415 


845 


PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACr 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 

SEFGIlMSEFPLDPQLSKSrLASCEFDCVDEVLTIA 

AMVTGILNDYSFSFFANLH 


3829 


A 


199 


683 


VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEIPVAKIRTEQESKGPMTRRLL 
LHEVPtGEVWVRFDLQLFEQKTLL 


3830 


A 


1747 


404 


RKMMEESGIETTPPGTPPPNPAGLAATAMSSTPV 

PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 

SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 

NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 

DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 

ASLTSLAQGTGTTSAITFPEEQEDPRITRGQDEAS 

AGGIWGFEKGVAGNPMVKSVLDKTKHSVESMT 

TLDPGMAPYKSGGELDIVVTSNKEVKVAAVRD 

AFQEVFGLAVWGEAGQSNIAPQPVGYAAGLKG 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLWEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TDWHMAFTGMSRRQMIYSAARAIAGMYKQRLP 

PRTV 


3831 


A 


5 


674


FWTRSAWHEGLQQMKANDPSLQEVNLYNIKNIP 
IPTLREFAKALETNTHVKKFSLAATRSNDPVAIAF 
ADMLKVNTTLTSLNIESHnTGTGILALVEALKEN 

UlLil xiJJsXL/iN l^Kv^i^L»0 1 A V liMjbjLAC^MLcliN oivLL 

KFGYQFTKQGPRTRVAAAITKNNDLAWQKDTQ 
EQTSIWQWSQSIAGFNPQFEVQGQNARSWMEE 
LGKAFHQFVRRELKQTEGKLP 


3832 


A 


164 


782 


EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE 










EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 


3833 



A 


122 


1676 


SQPPHFTQKM>IENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTKRHSDPDKSLQDEVSPRKNDnSVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRYIM 

EKIVQANKLLQNQEPVNDKRERKLKFKDQLVDL 

EVPPLEDTTTSKNYFENEKNMFGKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQDIASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKIHAV'IHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTQSAmSPVTST 

YCLSPRQKELQKQLEEKREKLKREEERRKIEEEK 

EKKRENDIVFKAWLQKKREQVLEMRRIQRAKEI 

EDMNSRQENRDPQQAFRLWLKKKHEEQNDCERQ 

TEELRKQEECLFFLKGTEGRERAFKQWLRRKRM 

EKMAEQQAVRERTRQLRLEAKRSKQLQHHL YM 

SEAKPFRFTDHYN 


3834 


A 


575 


774 


RSRTEELSNSGILKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQKNLYRKVMLENYRSLVSLGKDMSP 


3835 


A 


2 


100 


ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 


3836 


A 


91 


749 


RPTPGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYVV 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 

HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVSVYPAPAGLHIRRLVGLVLGTISEVS 

REPCFSSSSCWSCVAIKI 


3837 


A 


3 


1214 


SLGCTOSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDVVELLIKYGADVHAFSKFDKSAFD 

lALEKIWAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEWNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

ANTEEIIEGNSVDSSIQQVMGSGGQRVITrVTDGV 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEAIARQQPNGVDFTMVEEVAEVDAW 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 


3838 


A 


1 


1332 


MIEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKffQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDIFHLLKSHTNVLSVNLPDlSlFTLKEDGME 

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

EOJNVYRKRRKYFADLAMNYKHGDPIPKVEFTEE 

EIKTWGTVFQELNKLYPTHACREYLKNLPLLSKY 

CuYJlbUNlPQLbDVbNFLKERTGFSIIlPVAGYLSP 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

LKHALSGHAKVKPFDPKTTCKQECLITTFQDVYF 

VSESFEDAKEKMREFTKTIKRPFGVKYNPYTRSI 













QELKDTKSITSAMNELQHDLDVVSDALAKVSRKP 
SI 


3839 


A 


3093 


520 


MVNITVDQIRAIMDKKANIR>nviSVIAHVDHGKS 

TLTDSLVCKAGUASARAGETRFTDTRKDEQERCI 

TIKSTAISLFYELSENDLNFIKQSKDGAGFLINLID 

SPGHVDFSSEVTAALRVTDGALVWDCVSGVCV 

QIETVLRQAIAERIKPXHLMMNKMDRALLELQLE 

PEELYQTFQRIVENVNVUSTYGEGESGPMGNIMI 

DPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 

IMNFKKEETAKLIEKLDIKLDSEDKDBCEGKPLLK 

AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGKSCDPKGPLMMYISKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRIMGPNYTPG 

BaCEDLYLKPIQRTILMMGRYVEPIEDVPCGNrVG 

LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 

VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 

rEESGEHIlAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEEShT/LCLSKSPNKHNRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL 

NEIKIJSVVAGFQWATKEGALCEENMRGVRFDV 

HDVTLHADAIHRGGGQIIPTARRCLYASVLTAQP 

RLMEPIYLVEIQCPEQVVGGIYGVLNRKRGHVFE 

ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 

GQAFPQCVFDHWQILPGDPFDNSSRPSQWAETR 

KRKGLKEGJPALDNFLDKL 


3840 


A 


2 


753 


SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 

SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 

bLCKACi 1 VbNKJiAVTSMGGKSSCPVCGlSYSFE 

HLQANQHLANIVERLKEVKLSPDNGKKRDLCDH 

HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 

TEEVFKECQEKLQAVLKRLKKEEEEAEKLEADIR 

EEKTSWKYQVQTERQRIQTEFDQLRSELNNEEQR 


3841 


A 


2 


405 


GKAFSCFTYLSQHRRTHMAEKPYECKTCKKAFS 
HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 
JLJKrLtKJhl I (jKK.o i UOJvAr 1 KoKri-KljrlbK 1 
HTGEKMHECKECGKALSSLSSLHRHKRTHWRDT 
L 


3842


A 


311 


88 


AVLKNMAPMTALGLLDLHILNLILFLSAGEDFTS 
WSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 


3843 


A 


3 


1175 


APIRNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKKSVEEGKIDGIID 

KTnGDFQKEQKKFVEEQHTBCKSEAAVPPWVDT 

DQMYPVALVMLQEDELLSKMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPWIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VLDKKQEETAVLEEDSADWEKELQQELQEYEV 










VTESEKRDENWDKEIEKMLQEEN 


3844 


A 


798 


148 


LPPAQIPEAWLLLANWVVLILVPLKDRLIDPLLL 

RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 

LIGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 


3845 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHTOHEmTLQNKIKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS 

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3846 


A 


3 


1934 


PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPEHMEFTNMLQRKRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHTOHEffiTLQNKIKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS

MPGLTCFTHDNQHWQTAPFAVTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 


3847 


A 


1 


1257 


MVFSAVLTAFHTGTSNTTFWYENTYMNITLPPP 
FQHPDLSPLLRYSFETMAPTGLSSLTVNSTAVPTT 
PAAFKSLNLPLQITLSAIMIFILFVSFLGNLVVCLM 
VYQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTJLTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 
nSEDRFLIIVQRQDKLNPYRAKVLLWSWATSFCV 
AFPLAVGNPDLQIPSRAPQCVFGYTTNPGYQAYV 










Tt TQT TQP17TPI7I A/IT VQT7TV>fr^TT XTTT T?tTMAT DTTJCvm-c 

GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFrT 
ILILFAVFIVCWAPFTTYSLVATFSKHFYYQHNFF 
EISTWLLWLCYLKSALOTLIYYWRIKKFHDACLD 
MMPKSFiafLPQU>GHTKRRIRPSAVYVCGEHRT 

w 


3848 


A 


3 


2827 


SSAVAARRRRSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKULTARPFRLDLLEDRSLLLSVNARG 

LLEFBHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVDISSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDEEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD 

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRAWWANMFSYDNYEGSAPNLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVQV 

VI "P/^oriP\A\/vr^Tr\cv/^viJum3r^TT vr "da/tt octt> 
jlljrvjV^Ollv W I JJiys I (^JsJintjJry llv I LrV iLSsoli' 

VPQRGGTIVPRWMRVRRSSECMKDDPITLFVALS 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERWIIGAGKPAAVV 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD 

WSIHLR 


3849 


A 


1 


1717 


RARNARGCWGVCRSGFSSAVCGAARMEQVAEG

ARVTAVPVSAADSTEELAEVEEGVGWGEDNDA

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF

RKKIAENKAKAVRKDIQRLSLNNDIFEANSDSDQ

QSETKEDTSPKKKKKKLRQREEKSPDDLKKKKA

KAGBCLKDKSKPDLESSLESLVFDLRTKKRISEAK

EELKESKBCPKKDEVKETKELKKVKKGEIRDLKT

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE

DKETKIA^SKKPKKIDEVMTKELKKVKKGEIRD

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK

AEDTRENRKLENKNAFLEKKTVPKKQKNQDRSK

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS

AEEDKETKRNESKKPKKDEVKETKELKKVKKGE
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










IRDLKTKTREDPKENRKTKKEKFVESQVESESSV 
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA 
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 
EELKESBCKPK 


3850 


A 


1113 


3975 


PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RJRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVVRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRTELSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKJEGKERMYEENSQPRRNL 

TKLSLIFSHMLAELKGIFPSGLFQGDTFRrrKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDIFTRLFQPWSSLL 

R2>fWNSLAVTHPGYMAFLTYDEVKARLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALDDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIWDPFDPRG SGSLLRQG AEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

MYNIQSQAPSITESSTFGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP*PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSEIENLMSQG 

YSYQDIQKALVIAQNNffiMAKMLREFVSISSPAH 

VAT 


3851 


A 


2 


2781 


GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQAVDTAVDGVFIRSLKVNC 

KVTSRFAHYWTSQWNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKRNHMQYEIVIKVKPKQLVHHFEIDV 

DIFEPQGISKLDAQASFLPKELAAQTIKKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANNHFAHFFAPQNLTNMNKNVVFV 

EDISGSMRGQKVKQTKEALLKILGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGffilLNQVQESLPELSlvIHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIWAGRIADNKQSSFKADVQA 

nvjjC/VJ v^nr oi i v jjeiicivij*wJsj_. JUiviiKu IIJ^^ rl V 

ERLWAYLTIQELLAKRMKVDREVRANLSSQALR 

MSLDYGFVTPLTSMSIRGMADQDGLKFnDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHVPQKEDTLCFNINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










LGIANPATDFQLEVTPQNITLNPGFGGPVFSWRD 

QAVLRQDGWVTINKKRNLVVSVDDGGTF\EVV\ 

LHRVW\KGSS\VHQDFLGLLMCWDKSIGMSSPGR 

KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 

MVVRNPPGLTVT\RGLQKDYSBaDPWHGAEVSC 

WFI\HNNGA*I\TDCAYTDYI\VPDIF 


3852 


A 


39 


1735 


TQVAEAGRGEGWAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRIVIW\DF\LTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD 

SKHVVLPVDDDSDLNWASFDRRGEYIYTGNAK 

GKILVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRIIRVYDGREILTCGRDGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYTVAGSARQH 

ALYIWEKSIGNLVKILHGTRGELLLDVAWHPVRP 

IIASISSGWSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDffiDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 


3853 


A 


45 


2603 


PLLFTCGREVRARDPEICEGTIVVAGLKVQVQPRF 

LWILCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEKNVCKIYLSQLQTGEKSKN 

TIHEDTIFRMGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYLI 

QHLKIHTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQRIHSGVK 

PYECKECGKAFSRVRDLRVHQTIHAGERPYECK 

ECGKAFRLHYQLTEHQRIHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQKIHTGEKPYECKECGKSFSFHAELARH 

RRIHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICQYQLTLHLRTHTGEIPYEG 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRLQGELTRHHRJHTCEKPYECKECGKAFfflSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHRIHTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHRIHTGEKPYQC 

KECGKAFIRSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICNLRNFLFVTEHVGIPFTSCSQn 

RNYFVC 


3854 


A 


108 


894 


LQSCWVPGBPWPSVGWLSWLKDLPSCEIHSASLS 

AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 

DNLSTDDINTSSSISSYANTPASSRKNLDVQTDAE 

KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 

KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 

GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 
TP 


3855 


A 


1 


772 


FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS 

DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 

VGKSSLVLRFVKGQFHEYQESTIGAAFLTQSVCL 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQETFARAKTWVKELQRQASP\SIWGL 

AGNKADLAMKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQS\QQNKSQCCSN 


3856 


A 


2815 


352 


LGLEAAARPRPGGPAAMQDGlSfFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV 

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSKRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE 

LRKNVTGE.WNLSSSDHLKDRLAKKTPLE\QLraD 

LGV*APLSGAGGPP\LIQQNASEAEIFYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSREULRELPLA 

ADALTFAEVSKDPKGLEWLWSPQIVGLYNRLLQ 

RCELNRHTTEAAAGALQNITGG\DPRGPGGLSRL 

ALEQERILNPLLDRVRTADHHQLRSLTGLIRmS 

RNAKNKDEMSTKW\SHLI\EKLPGSVGEKSPPAE 

VLV\NI\IAVFNNLGWLASPI/ALARDLLYFDGLRK 

LIFKKKRDSPDSEKSSRAASSLLANLWQYNKLH 

RDFRAKGYRKEDFLGP 


3857 


A 


1034 


204 


VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LKPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKKKTKDLGFRAGKESKTEWRK*GLQDMASQ 

MFALPLK*PVTAAFHDSSMPSSLLQIEMEQLFLE 

ARLQ/PDSKSEARRNQCDSMLLKNQQLCSTCQE 

MKMVQPRTMKIPDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 


3858 


A 


203 


3469 


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLRNDSNFALQTMEPALPMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKKWQIYCSKKK 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKHESLKTALRTKPMRFVTRFIDLDGLS 

CZLhn^KTA/TOYETSESRIHTSLIGCIKAlJVlNNSQG 

RAHVLAHSESINVIAQSLSTENIKTKVAVLEILGA 

VCLVPGGHKKVLQAMLHYQKYASERTRFQTLIN 



SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










DLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVE 

SLDFRLHLRYE\FLMLGIHPWIDKLRKHENSTLD 

RHLDFFEMLRNEDELEFAKRFELVHIDTKSATQM 

FELTRKRLTHSEAYPHFMSDLHHCLQMPYKRSGN 

TVQYWLLLDRnQQIVIQNDKGQDPDSTPLENFNI 

KNVVRMLVNENEVKQWKEQAEKMRKEHNELQ 

QKLEKKERECDAKTQEKEEMMQTLNKMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFKILDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDH.SSKLKVKELSVIDGRRAQNCNILLS 

RLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFV 

PEKSDroLLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEAIliSGSEE 

VFRSGALKQLLEWLAFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLinVENKYPSV 

LNLNEELRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQFITVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF 

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 


3859 


A 


1279 


141 


RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLWSEETYRGGMAINRFRLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 

ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLGAF 

VTOSDHLGHRAYAPGGPAYQPWEAFGTDILHK 

DGIIMlKVLGSRWGNKKQLKILTDIMWPnAJKlA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVIPETEAVRRIVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHVVLS'nCGSRISPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPWRRN 

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 


3860 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAEFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAWFGTWDIISRS 

GEKIPVSVWMKRMRQERRLCCWVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTfflGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

OUJrlV VrKiJblKKXiMbSQDlFTGTQTELIAGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVrVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVWKFIKKEKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

"HTTriQA AVT VQf^Vl T?VT17r*riTn7Vr'AT>P\/T XyfnXrDV 
J-^rOo/VA iLJlJvOISJLr I lrL-.01lJDii-^AJrEVJLjiVlOlNr i 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3861 


A 


1 


3881 


MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDWEALSEEHMEADGHAAWFGTWDIISRS 

GEKIPVSVWMKRMRQBRRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV 

GNESGCGERTLDPWQGQDPAEGGQDPRINWLA 

GGHWPRDEIRKLMESQDIFTGTQTELUGGQLL 

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVTVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSOKYSTM'iPLGSGAFGFVW 

TAVDKEKNKEWVKFIBCKEKVLEDCWffiDPKLG 

KVTLEIAILSRVEHANDKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAGXQ 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 


3862 


A 


399 


2069 


TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 

WGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSKLQE 

EEQERDRJKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

KPKUOAAR 1 PK VlMr> SARQDLMGGKJKFFDLPSK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 


3863 


A 


399 


2069 


TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 
SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 
FFSEVFKVRHRASGQVMALKMNTLSSNRANML 
KEVQLMNRLSHPNILRYINSGNLEQLLDSlvfLHLP 
WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 
LIKRDENGYSAWADFGLAEKIPDVSMGSEKLA 
VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 
ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 
FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 
EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 
HKSPCPRRHWLSRSQSDIFSRKPPRTVSVLDPYY 
RPRDGAARTPKVNPFSARQDLMQGKIKFFDLPSK 
SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 
"SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 
YRVKEEPPFRASALPAAQAHEAMDCSILQEENGF 
GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 
GIGLQTQGKQDG 


3864 


A 


3 


911 


SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

KAl>J'6lNr JLLJJKlsJv 1 lJJLLi<JsJKJilsJ<JKJ<J<JUoUArOK 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMJECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 


3865


A 


3 


3573 


QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

FT TTA'RP'RnPUQTWrPVQT PTiQriTT'PTQPPATr* A HTA C 

ri-* 1 i/vinj\jc\.ojsjvoivJx V oJ-rfJi i oUx A lU/\Ji lAo 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCTILQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTIEQKSSEDQGIKGRIEKAANPSGKK

KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

NDCILKHAAATMKFLSSGKEQKPKPKEKMmK 

PEKPSLPKCGAQAGIKISSVHKRPAPEKKETTVK 

KAVVVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAEKKPPSGFKGTIPKRPWLSATPSSGASAARQAG 

PAPAAATAASKKFPGSAALVGAVRKPWPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RFLFFILFRVNDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 

LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHRAHLFDLNCKICTGQVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTMGGRIAPKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGVVA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 


3866 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLWIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

G\LPFnFQGG]VIPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALWLAEEEL 

WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPBDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PbUl^LALbOPLbRVKSLKKSLRQSFRXMRRbRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










I^ivi^JSJ^ 1 AL«ii\JoK V ivK V o V Arlr OoKKAJiD Y vjxirLrl 

LAVLTNLGDIQWSLPLLKPQVRYSCIRREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKGNLVEPRC 
L\^SAETKNHRPGNGAGPKKAPSRARNSGTQSD
GEEKQPGLVMERALLSDERAATGWHIEPPWGA 
ASAMAEQSEWLSVQAAR 


3867 


A 


2 


3181 


AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 
QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 
FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAK 
LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 
LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 
GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 
LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 
EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 
HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 
VSSEAQQPEPLRSLVPYGPFPCKAITRJDLWLTTRQ 
G\LPFTIFQGGiVIPRASYGDRHCISVIHDGQQTAFD 
FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 
WIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 
BPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP 
APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 
STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF
DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 
LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 
RLAARSGPVRFEPGFQPFVLVQCQPPAWTSLAL 
HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 
PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 
SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV 
QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 
PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 
QAKEIQLMHRAPWGILVLDGHSVPLPEPLEVAH 
DLSKSPDMQGSHQLLWSEEQFKVFTLPKVSAK 
Llso^lsJ^ 1 ALbuaKVRKVbVAHr OSRRAEDYGEHH 
LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 
ASCVFTKYGQGFYLISPSEFERFSLSTKG\LVEPRC 
LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 
GEEKQPGLVMERALLSDERAATGWHIEPPWGA 
ASAMAEQSEWLSVQAAR 


3868 


A 


1 


2497 


GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWILEATTKASMPLAPTMAPAPA 

APSTAWPTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRWGGFGAASGEVPW 

QVSLKEGSRHFCGATWGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVKIGLRRWLHP 

LYNPGILDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GIIDQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KLRAELDEVNKSAKKREGELTVAQGRVKDLESL 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAMAGERDKFRKMLD 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKRKR\WRWRSPW\QRPKRPG 

QLKNNSDKDQSLGNWRJKRQVLEGEEIAYKFTP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 


3869 


A 


1 


1942 


RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS 

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRKRENLG/RLG 

IVRIFPVnT\GAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL 

OLKaVPbrPrbaFuQPNLRPDDSAr QRQSTPRVSF 

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

rvA 


3870 


A 


2 


3485 


FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGffRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSVPPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTirVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RRNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLroiMDTSQQKSAGLLMVHTVDATKLDNA 

LQEEDEBCAEVEMKPDSSPSEVPEGVSETEGALQI 

SVDLDEDFlFi'EPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTNSADSKKPAS 

LSNCLPASFLPPPESFDAVADSGIEEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ 

AFMVDKPPVPPKPKMKPIIHKSNALYQDALVEE 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMSTISGTRST 

PWSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKQ 

LLDR 


3871 


A 


35 


1171 


VESRSAWHEGEDQIDRLDnRNQMNLLTLDVKK 

KKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEDENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

li^AiJriJlNAljyjtJfcLJVLIlJLV J oJLA^V J oRi oMOillV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 
HAKERAFKQQFVNYATEKLRMWSSTSANCSHQ 
VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 
QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 
NEES 


3872 


A 


35 


1171 


VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

BCIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD 

EVNALVLQTQQEHENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

IrAlrL'iNAol^E.tLjyUl J-V 1 uLAoV 1 oKl oMOlllV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 
HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 
VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID 
QLEKIQNNSKIXKNKA VQLENELENFTKQFLPSS 
NEES 


3873 


A 


2944 


2089 



PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 
QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 
KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 
oooAJLr WINll Vilr Js.arrllKX'LL/nCroKJJAIJlAnr 
MSCMKEADALKHKSQVINEMQKKDHKQLAVMG 
LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 
QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP
SAIDPEDGEKKNQVMIHGffiPMLETPLQWLSEHL 


3874 


A 


776 


366 


QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 

DEARKIGWGWVKNTSKGTVTGQVQGPEDKVN 

SMKSWLSKVGSPSSRffiRTNFSNEKTISKLEYSNF 

SIRY 


3875 


A 


1081 


182 


SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

OODPEVPKSLVSNLRIHCPLLAGSAI ITFDDPKVA 

EQVLQQKEHTINMEECRLRVQVQPLELPMVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










RGGGEVEALTWPQGQQGLAVFTSESG 


3876 


A 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPLSL 
VVHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGBEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 


3877 


A 


3 


1291 


KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCmPSMGLNEEQKEFQKV 

AFDFAAREMAPmiAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKFASYCLTEPGSGSDAASLLTSAKKQGDHYI 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCrV 

VFKfiTPfii <;prii<rifFk'ifVfiWM<!nPTR AVTPPnr'A 

V i-fXVVJ 1 VJl_«OX^ VJPkJVCfXS-iV V VJ WIN iJV^Jt Itvrt. V JXCJ-'V^A 

VPVANRIGSEGQGFLIAVRGLNGGRINIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMRILISRSLLQE 


3878 



A 


10 


1014 


LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 
SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 
TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 
PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKPSA 

TESTTSWPRFS.SWTSGFnPASPAPAI 


3879 


A 


200 


699 


LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 
NTSLCTTIDYKTTOVT FPI I YTVT FFVrJT ITTviriT A 

MRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILS 

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT 

IDRYQKTTRPFKTSNPKNLLGAKILK 


3880 


A 


26 


169 


QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 


3881 


A 


37 


1100


TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 
RIMPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT 
FAPIYVGIYFLGFTPDHRCRSPGVAELSLRCGWSP 
AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 
FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 
rVTFFNT VCANSWMT DT Fr)<?<JV>Jvr!FFTr;<Jivf'!rr; 

YIADRFGRKLCLLTTVLINAAAGVLMAISPTYTW 

MLIFRLIQGLVSKAGWLIGYELITEFVGRRYRRTV 

GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV 

ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRIIK 

HIAKKNGKSLPASL 


3882 


A 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSYIGP 

KRTAVVRGIMHREAFNnGRRIVQVAQAMSLTED 

VLAAALADHLPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDIEGAVRRYVQPFLNALGAA 

GNFSVDSQILYi^AMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 
SEQ ID
NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion










AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 


3883 


A 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDILVPILFFLNDARADQSRVGLM 

fflGVFILLLLSGECNFGVRLNKPYSIRVPMDIPVF 

TGTHADLLIVWFHKIITSGHQRLQPLFDCLLTIW 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVFNNIIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPTIHKALQRRRRTPEPLSRTGS 

QGGAPPWRAPAPLPLQSQAPSRPVWWLLQALTS 

*PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICEDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMV\^GVm.RNVDPPVWYDTDVkLFEIQRV 


3884 


A 


1 


804 


NGPRAPFSQEGQSTGPPPLEPRLGQHGAQGRIPPL 

NPGQGPGPNKDDSRGPPNHHMGPMSEKRHEQSG 

GPEHGPERGPLRGGQDCRGPPDRRGPHPDFPDDF 

SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 

RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 

RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 

GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 

LRGRGRGTPRGERVTKDTWSGRIGCRIHWL 


3885 


A 


3 


996 


GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPIVINAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMDHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SWKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 


3886 


A 


773 


317 


QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
TKSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGIVGAVMAVLLLALIILIILFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLETV 
DSGTELHIQVRPEMVASTV 


3887 


A 


3 


466 


VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 
RQLLRKADGVVLMYDITSQESFAHVRYWLDCL
QDAGSDGVVILLLGNKMDCEEERQVSVEAGQQL 
AQELGVYFGECSAALGHNILEPWNLARSLRMQ 
EEGLKDSLVKVAPKRPPKRFGCCS 


3888 


A 


3412 


3144 


QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 




3889


A


1 


1 loU 


LVVTAITAILAFPNEYtRMSTSELISELFNDCGLL 

DSSKLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKTVI 1 in FGMKIPSGLFIPSMAVGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLWIMFEL 

TGGLEYIVPLMAAAMTSKWVADALGREGIYDA

LLTVLTQDSMTVEDVETnSETTYSGFPVWSRES 
QRLVGFVLRRDLnSIENARKKQDGVVSTSnYFTE 
HSPPLPPYTPPTLKLRNILDLSPFTVTDLTPMEIW 

DTFRlfT ni ROOl VTHMrjPT J nTTTIf I^nVT VHTAr* 
uir I\£%J^\JLtI\\^\^Li V 1 XllNVjr^X/JLrVJJUL i rsJ\XJ V l_»JSJllAw 

MANQDPDSILFN 


3890


A 


1 


387 


SWCWTGffVLGTTNLRLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RVPYTKLQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRRVKDKKIVSKLKDTVS 


3891 


A 


2 


2914 


RQGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPTVVRA.AELEQVPHIALFLFK 

KTRLSITICFFSICFLLPYCGLDTLADQNXNQVRKT 

SQAALL\ALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAPXMVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSWGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NENDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITMATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDWPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRIGVLKH 

LHDFLKLLHIDKRREYLYQLQEFLVTDNSRNWR 

rKAbLAtQLU^LLbLYSPRDVYDYLIU'IALNLCAD 

KVSSVRWISYKLVSEMVKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTKISEDAMSTASSTY 


3892 


A 


158 


2191 


VPLPAPSGLSGGGSRGAGCKKAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 

DNHDYIVRSGERWLERYEEDSLIGKGSFGQWKA 

YDHQTQELVAIKIIKNKKAFLNQAQIELRLLELM 

NQHDTEMKYYIVHLKRHFMFRN\HLCLVFELLS 

YNLYDLLRNTHFRGVSLNLTRKLAQQLCTALLF 

LATPELSIIHCDLKPENILLCNPKRSAIKIVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

VEVLGEPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP

GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 
OAPA'^AS*!! PfiTfiAOI PPDPHYT fiT?PP<5T>T<;PPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 

S 


3893 


A 


68 


258 


PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 


3894 


A 


1120 


136 


SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 
AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 
PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 
QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 

A A P A PrSnP A P A ThT^PT rJVPP\/P a \Tl/T T P'nCPPT X>nT 

HSGPPPAAVSLPPAAAACPWVPPPLPHHPPDLES 
PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 
LLPLPRPPS*P/VPWKPLHSPVAVAGGSFVAGGSV 
LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 
PTVP*APPVSGAAPGRKREW 


3895 


A 


2 


1347 


FGAVSYRPGNGSCWVKVTASSDLSDLISCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEBETQKRRAKESGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL»LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

POTTQCxni^Ttrv/CMV'vriTTMjrr'QCT t Tr\urwQr^jj>i: 
rKlLloo i J-JS. 1 rl V oIN Y vj 1 JJr IL/aoL'L' i ^^il^s^ivodKil 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 
DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 

NSCLALHQKTmGEKPYTCKECGQAFSVRSTLTO 
HQVmSDK 


3896 


A 


202 


498 


MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLC 
NNKLLKENAVPTIFLCTEPHDKKEDLLEPQEQ 


3897 


A 


2 


382 


SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMQHFILLFSRQGKLRLQKWYITLPDKER 

SLYFCCAIEVNQDNELLTLENVHR 


3898 


A 


718 


305 


SEQEPLLGDTPGSREWDELETEEHYKSRWRSnUL 
YLTMFLSSVGFSWMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 


3899 


A 


24 


718 


FRGRPGIPEREGKGNHSFVEVARVIVVDLHSRLG 

VNAShJLEKQTSKGKYFVTFPYPYMNGRLHLGHT 
FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 
ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 

LSDEEIVKFSEAEHWLDYFNALAIQDLKRMG 


3900 


A 


360 


1 


VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPQLRATA/PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 


3901 


A 


193 


345 


GEWAVPPAPGGQGVSEPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP


3902 


A 


1188 


1389 


NPAARSAAAREGSPALPPPPVS/SSSGLOT T T PT "^P 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR 


3903 


A 


63 


396 


NNMRNPHLSSNHYLNLARTETVFARMESVKQRI 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGETIE 
LTEDGKPT *VPFRKAPT fDrTPFfil PHT?VTTAT\/rQ 

GLGFCISFG 


3904 


A 


732 


1046 


AMSECPLBLYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREVVFGKSEDEHYPLW*VLFGK*YA 
VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK 
SR 


3905 


A 


46 


910 


QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

SPYHSEPDAKLDEYIAIAKEKHGYNVEQALGMLF 
WHKHNIEKSLADLPNFTPFPDEWTVEDKVLFEQ 
AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 
TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 


3906 


A 


2 


513 


KVCNCCSQELETSFTYVDKNINLEQRNRSSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSIDYGnSAILFLVTGILLVnSYIVPREV 

TVriP>JTV A A R PA;rRP T "Plt^RQ A P T n A "UT r\p r^\/T A n 
1 V i_/riN 1 V /\/vivE,ivijiisjL/llisjDo/\i\^ V LAO 

LCLLTLGGVILSCLLMMSMWKGELYRRNRFAS 


3907 


A 


71 


412 


ILIMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRWITGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 


3908 


A 


77 


746 


LGTLLGWRAPLFSRCLAFHSPFILLNTPKLVKTAE 

1 PPDRNYVT OAHPFinrMr'TriFT PMFQTFQMnPQO 

LFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVS 

RQSLDFILSQPQLGQAVVIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 


3909 


A 


1 


793


FRAAGRPAAAMGDIPVVGLSSWKASPGKVTEAV 
KEAIDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

ALKLNYLDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVITGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDLIDNPVKRIAKEHGKSPAQILI 


3910 


A 


202 


705 


FFTMHRTticvn^JPTpn rFTwriVAFPOP^T pwvrsr* 

rr X ivixisvTv^ V LylNJ^JJ^JJJU3lNU^ V fvDlV Y V VOU 

RGKDQWILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRMRQLQKKIKNGTLNIKQDDPFELFI 

AATOIRYCYY>ffiTHKILGNTPGMCVLQDFEALTP 

NLLARTVETVEGGGLVVILLRTMNSLKQLYTVT 

M 


3911 


A 


3 


723 


AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPArVQNTTFGKYEKTHVCNLKKFKVFGGMN 

EE>MTELLSSGLK1TOYNKETFTLKHKIDEQMFPC 

RFDOVPLLSWGPSFNFSIWYVELSGIDDPDJVQPC

LNWYSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 


3912 


A 


2 


461 


FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 
LFALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 
QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 

LSSNTSLKLRKLESLRR 


3913 


A 


362 


20 


APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGKMRKPDSKIVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISrWDTAGEAGAA 


3914 


A 


1


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLQKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKWETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEJCLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMfflQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTWPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVElNnEMTKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHWGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

lEADEGLnGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANrVNSVVTEEBCDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTWEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDnTSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGWVESE 

^fERAGTVMEEKDGSG^STSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMG 

AVLQDEDRLTITRVEDLSDAAnSTSTAECMPISA 

SEDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ

GTVAEHSiFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLBCMEAYVPS 

EEEKNGEELAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 


3915 


A 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSjEPSKPARRLSESLHWDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHKNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSICSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

Q'nv r^T>v QV\/i^'r\T^i>'ET7'nTr^'\m'D^rT jyrA ceo Arjcnr/^ 
oiiKv^KKoK y tJL'Krr tt 1 Ij VJbrVLt. I AoDoArlaTlj 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDKKKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDroSEN

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVE^^5MTKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHWGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

ffiADEGLUGTHSRNNPLHVGAEASECTVFAAAEE 

GGAWTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDWTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLnSTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEEI^TGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVWESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SIDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

T?rM "CCI T5V'TCCT7TTvTOnrTCT>\rNjrccVTM7VCCCT?T*r'/^t? 
KUL/imijr'KlooilllNol 1 oKVJVltlllKlJil X ooblll lub 

KPEQNDDDTIKSQE 


3916 


A 


2 


773 


GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQLDVLEALFAKTRYPDIFMREEVALKINLPE

SRVQVWFKNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLWAKLPCPLHIFSLCVFIEENRLV 
SGSWARDIRSVEETDKSGYR 


3917 


A 


2 


776 


RNIPGRRFRPPGLRRLLKGPHMPREPRGYRTRVP 
ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

PSRAGASVMALRKELLKSIWYAFTALDVEKSGK 
VSKSQLRVLSHNLYTVLHIPHDPVALEEHFRDDD 
DGPVSSQGYMPYLNKYILDKVEEGAFVKEHFDE 

FNFLSEDKYPLIMDPDEGEYLLKRYS 


3918 


A 


10 


318 


WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 

RRLASLRTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 


3919 


A 


1 


204 


RVLTAINHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNMHEENLKTKKQHRKERLYPLRKYA 
AKA 


3920 


A 


1 


654 


RCCRSFVAPLQEKWFGLFFLGAILCLSFSWLFHT 

V Y t-lisbO V bKJLr aJvLD Y sCjIALLIMGSF VPWL YY 

SFYCNPQPCFIYLIVICVLGIAAIIVSQWDMFATPQ 

YRGVRAGVFLGLGLSGIIPTLHYVISEGFLKAATI 

GQIGWLMLMASLYITGAALYAARIPERFFPGKCD 

IWFHSHQLFHIFWAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 


3921 


A 


1587 


452 


LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPVAAPSRTPAPPHTRARASPGLPSG 

PAWRRVQWFSRVSGQVSTLMKATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYNILDNSKnSEECRKELTALLHHYYPIEID 

PHRT\^KLPHM\^WWTKAHNLLCQQKIQKFQI 

Ar^VArcPQivTAAAT pnnwTciTxm vxtkixitdt ctcc a 
AV^V VKJDOINiuyijjiNJIO X 1 U YrlJNiNlrLrlroA 

GIGDELEEIIRQMKVFHPNIHIVSNYMDFNEDGFL 

l^Orivo*s^lwxrl 1 1 JN rwIN ooAL^xiN\-'Vj I r s^L.Jc.LjJv 1 in V 

ILLGDSIGDLTMADGVPGVQNILKIGFLNDKVEE 
RRERYMDSYDIVLEKDETLDVVNGLLQHILCQG 
VQLEMQGP 


3922 


A 


2 


164 


GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSILHTSYGRERPAPVHLRQDT 


3923 


A 


2 


3258 


EHATHAYAKLGTRRRHREVTVFVPTWQLKKNR 

RVRESHFLTKLHSLKMLSITPSQLENGKKITTYD 

YRFMVKLAEETDGHVTNEQIHILMNSSKKLMVK 

DRLLPFTFAGNLEMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVWKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDIDLLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLTIPSNFTALSFFMGFMDSHRDAIPDYEALVG 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPWLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGEl«fRL 

LTPAASMPRFFQVLPPFSDLSTFVCEHMSGYCFYR

EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPWFLTHCNWIFSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYnSLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWWPTQLRRDLIFSVHDIPLGAHQR 

PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQffiWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 

VLTGGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 

SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 

KVLEQ 


3924 


A 


1 


1826 


MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV 

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANKIDDVIDSRVEDPEEGHLKFSSELGMIF 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDLPAASWICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVmnSADTLAYSSSPWRGGFNWGLHFKWDLV 

PLSELGRAEGATAPKSPTMAGGLFAMNRQYFH 

ELGQYDSGMDIWGGENLEISFRIWMCGGKLFIIP 

CSRVGfflFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPBFVNR 

GPKRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLWLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 


3925 


A 


5386 


2897 


vrwnsktecylsiqtqenfpanlnelvncrvissl 
vttqrklkamsllgsrnqlaravlnpnpmdfct 
kdlltttseriiaylrdfnedqkkaietayamvk 
hspsvakiclfflgppgtgkskttvgllyrlltenq 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

BCKIBLEFKEKCKDKKNPLGNCGDINLVRLGPEKSI 

NSEVLKFSLDSQVNHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSniLESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMAKFCRLLEENVEHNMISRLPILQLTVQ 

YRMHPDICLFPSNYVYNRNLKTNRQTEAIRCSSD 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EIIKLIKDKRKDVSFRNIGnTHYKA.QKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV

KILKLKPVLQRSLTHPPTIAPEGSRPQGGLPSSKL 

JJour Ais. 1 o V AAoL* i rl 1 r oUalsJil 1 L 1 V 1 oKJLJJrllKlr 

PVHDQLQDPRLLKRMGmVKGGIFLWDPQPSSPQ 
nrvj A i rr 1 OJtJrvjr Jr v v nv^L/JUori v v^v^r AA V V AAL. 
SSHKPPVRGEPPAASPRASTCQSKCDDPEEELCH 
RREARAFSEGEQEKCGSETHHTRRNSRWDKRTL 
EQEDSSSKKRKLL 


3926 


A 


99 


284 


MPREDRATWKSNYFLKIIQLLDDyPKIlFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMMR 


3927


A 






FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAWACAWWIISLVAVIPMTFLI 
TSTNRTNRSACLDLTSSDELNTIKWYNLBLTA\LL 
CLPLVIVTLCYTTIIHTLTHGHAN\DSCLKQKARR 
LTILLL 


3928 


A 


1 


1516 


GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

MRLSKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKHYGASSCDGCKGFFiRRSVRKNHMYSCRFS 

RQCWDKDKRNQCRYCRLKKCFRAGMKKEAV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSGINGDIRAKKIASIADVCESMKEQLLVLVE 

WAKYIPGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

aUl^ JsJKKJLKo yVyVbLtiiJi 1NUK<^ i UsKOKr Oil 

LLLLLPTLQSITWQMffiQIQFIKLFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPHLMQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVATIVKPLSAIPQPTITKQE 

VI 


3929 


A 


1 


2782 


RVLSLESPLEKDPRVLGAQSVPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEl 

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQIPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSILIKHQRTHLREDPFKCPVCG 

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRIHRGERPYICADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLITHVRTHMDENLFVCSD 

CGKAFLEAHELEQHRVIHERGKTPARRAQGDSL 

QHQRIHIGENPYKNADGLIAHAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRIGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPBLGKSSSVLL

EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 
TQEKPPNPEDPPPEAVTLSTDQEGEGETPTPTESS 
SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 


3930 


A 


513 


273 


KTQETHIYISEHIFFPFLQGFGNLPICMAKTDLSLS 
HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 
SRESPLWL 


3931 


A 


16


305 


KRJRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3932 


A 


16 


305 


KRRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 


3933 


A 


1 


1546 


STHASEHWDSALQLAKHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVNQALKHPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVYIRSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAVV 

AYENAKQWQSVBRIYLDHLNNPEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQHNKMEIYADnGSEDTTNEDYQSIALY 

FEGEKRYLQAGKFFLLCGQYSRALKHFLKCPSSE 

DNVAIEMAIETVGQAKDELLTNQLIDHLLGEND 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLNDLHSYILVKIHVKNGDHMKGARMLIRVANN 

ISKFPSHIVPILTSTVIECHRAGLKNSAFSFAAML 

MRPEYRSKIDAKYKKKffiGMVRllPDISErEEATTP 

CPFCKFLLPESELL 


3934 


A 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 
GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 
CQTQDRDALRLTLEQBDLIRRMCASYSELELVTS 
AKALNDTQKLACLIGVEGGHSLDNSLSILRtFYM 

JLr\j V x\. X xi-< 1 XX X \_ri^ xir vv ^vctOO/ilXvvj V nijr i iyi.y±ij\jx^ 

TDFGEKWAEMNRLGMMVDLSHVSDAVARRAL 

EVSQAPVBFSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAVMGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 


3935 


A 


1 


883 


HETTPAWQSVLLERGWNKFDKQEQNAEDWNL 

YWRTSSFRMTEHNSVKPWQQLNHHPGTTKLTR 

KDCLAKHLKHMRRMYGTSLYQFIPLTFVMPNDY 

TKFVAEYFOEROMLGTKHSYWICKPAELSRGRG 

BLIFSDFKDFIFDDMYIVQKYISNPLLIGRYKCDLR 

lYVCVTGFKPLTIYVYQEGLVRFATEKFDLSNLQ 

NNYAHLTNSSINKSGASYEKDCEVIGHGCKWTLS 

RFFSYLRSWDVDDLLLWKKIHRMVILTILAIAPS 

VPFAANCFELFGFDILIDDNEFHRTG 


3936 


A 


203 


441 


HLAHSLGPLPKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESILEGI 
KQSKQENLDIDV


SEQ ID NO:


Position of end of
Signal in Amino Acid
Sequence


MaxS (MAXIMUM
SCORE)


MeanS (Mean Score)


1 

1 


10 

1 


fl o^n 


ft ^Rft 

U.OoU 


0 




n 064 


ft 51*^"^ 


•5 

J 


91 


n 000 


ft oni 


H 


1Q 


n 0R1 


ft QAI 


J 


99 


n 001 


ft 09 ft 




91 


n os^ 


ft R4'? 


g 


99 


\j.y Lj 


ft 71R 


Q 

y 


17 




ft QAQ 


1 1 


1Q 


ft Q'^n 


ft Aftft 
U.OoU 


1 J 


JO 


v.yoj- 


ft QAT 


14 


9S 


ft 0*^^ 


ft QTO 


1 s 


91 


ft 007 


ft 0^^ 


1 V 


l^i 


ft QRl 


ft O/t/1 


17 

1 / 


1 JJ 


ft OQO 


ft Qft/I 


1Q 




ft OOA 


ft T1Q 
U. / iy 




9« 


ft 070 


Kj.y^X) 


91 




ft 0^4 


ft Oft^ 


99 




ft 0^^ 


ft ^/:ft 


9"? 




ft O/IO 


U.oj4 


94 


10 


ft 070 
V.y ly 


A Q/1 1 


9^ 


^4 


ft fiQ/1 




9fi 


J J 


ft C^/l 


u.De4 


97 


17 


ft 07^ 




95t 


1 ft 


ft OBft 

v.yoK) 


A O0/1 


90 


90 


ft OOC 


A T 1 0 
U./ 15 


J V 


zo 


ft QTft 


A CC^ 
U.OOJ 


19 


9n 


ft 


A TIO 


J J 


90 


ft Q'^T 


ft 

u.O / 1 






ft QO^ 


A OOA 


JU 


9/1 


ft Oft^ 


ft ^"70 


40 


10 


ft 091 


ft 0/19 


47 


9^ 


ft 071 


ft OftO 

u.yuy 




99 


ft 001 


A OOQ 




94 


ft OAft 


A CAC 




10 

X? 


ft oc/i 


u.yo/ 


7R 


99 


ft 01 '3 




86 


9n 

Zu 


ft ftff^ 


A <« 


87 


94 


ft 0527 


ft CQO 

u.ooy 


88 


17 


ft 007 


ft OAO 


115 


10 

X-' 


ft O^ft 


ft ACft 
U.OoU 


134 




ft OR"^ 


ft RAT 


136 


17 


ft 01^ 


ft AOA 

u.oyo 


137 


10 


ft OSR 


ft Qft^ 


140 


9X 
xo 


ft 0"^^ 


ft R'^O 






ft 014 


ft 7y*ft 
U. /4U 


1 J J 


91 


ft 007 


ft 


154 


9S 


ftOll 


ft 


155 


29 


0.972 


0.857 


169 


30 


0.977 


0.817 


170 


30 


0.977


0.819 


171 


30 


0.977 


0.819 


175 


47 


0.926 


0.606 


176 


30 


0.968 


0.872 


177 


22 


0.957 


0.791 


192 


43 


0.930 


0.678


195 


19 


0.956 


0.860 


202 


21


0.982 


0.871 


203 


24 


0.957 


0.870 


207 


23 


0.954 


0.905 


224 


46 


0.955 


0.568 


225 


26 


0.942 


0.654 


228 


45 


0.961 


0.839 


231 


28 


0.994 


0.937 


232 


28 


0.993 


0.896 


234 


19 


0.979 


0.942 


235 


19 


0.979 


0.941 


238 


20 


0.987 


0.943 


244 


23 


0.929 


0.683 


250 


34 


0.884 


0.565 


256 


33 


0.934 


0.584 


258 


25 


0.934 


0.729 


259 


22 


0.969 


0.871 


264 


19 


0.952 


0.753 


265 


17 


0.975 


0.914 


266 


17 


0.975 


0.914 


271 


23 


0.974 


0.884 


274 


13 


0.971 


0.834 


275 


18 


0.980 


0.934 


278 


32 


0.958 


0.668 


280 


24 


0.966 


0.881 


281 


24 


0.966 


0.881 


286 


23 


0.928 


0.718 


291 


35 


0.991 


0.824 


293 


27 


0.956 


0.806 


294 


23 


0.952 


0.827 


301 


26 


0.978 


0.885 


316 


20 


0.946 


0.719 


320 


28 


0.978 


0.726 


327 


29 


0.933 


0.671 


331 


48 


0.903 


0.571 


345 


25 


0.996 


0.920 


349 


26 


0.903


0.579 


351 


24 


0.951 


0.876 


352 


18 


0.944 


0.716 


353 


32 


0.992 


0.854 


354 


27 


0.945 


0.817 


355 


16 


0.922 


0.716 


356 


13 


0.959 


0.818 


357 


23 


0.986 


0.878 


358


19 


0.904 


0.671 


359 


16 


0.988 


0.951 


360 


15 


0.981 


0.938 


361 


18 


0.944 


0.716 


362 


21 


0.984 


0.869 


363 


40 


0.979 


0.813 


364 


18 


0.883 


0.693 


365 


22 


0.962 


0.908


366 


22 


0.961 


0.827 


367 


44 


0.941 


0.624 


368 


20 


0.952 


0.791 


369 


22 


0.949 


0.840 


370 


28 


0.957 


0.682 


372 


28 


0.974 


0.894 


373 


19 


0.972 


0.947 


374 


29 


0.968 


0.785 


375 


19 


0.949 


0.897 


377 


23 


0.962 


0.910 


378 


31 


0.974 


0.895 


379 


26 


0.969 


0.939 


380 


27 


0.945 


0.817 


383 


27 


0.945 


0.817 


384 


25 


0.992 


0.877 


385 


32 


0.983 


0.825 


386 


44 


0.924 


0.564 


387


26 


0.971 


0.894 


388 


19 


0.989 


0.862 


389 


24 


0.990 


0.947 


390 


34 


0.942 


0.635 


391 


16 


0.922 


0.716 


394 


19 


0.987 


0.970 


398 


36 


0.992 


0.866 


404 


13 


0.959 


0.818 


417 


23 


0.986 


0.878 


421 


19 


0.904 


0.671 


425 


28 


0.971 


0.717 


431 


16 


0.988 


0.951 


452 


18 


0.944 


0.716 


459 


21 


0.991 


0.902 


468 


21 


0.984 


0.869 


478 


40 


0.979 


0.813 


486 


18 


0.883 


0.693 


499 


22 


0.962 


0.908 


501 


19 


0.962 


0.877 


514 


44 


0.941 


0.624 


529 


20 


0.952 


0.791 


533 


39 


0.914 


0.719 


548 


28 


0.957 


0.682 


561 


28 


0.974 


0.894 


562 


28 


0.974 


0.893 


564 


18 


0.949 


0.806 


576 


19 


0.972 


0.947 


584 


29 


0.968 


0.785 


585 


28 


0.973 


0.810 


591 


19 


0.949 


0.897 


592 


24 


0.991 


0.954 


594 


20 


0.985 


0.959 


595 


20 


0.985 


0.959 


612 


23 


0.962 


0.910 


619 


31 


0.974 


0.895 


621 


15 


0.959 


0.795 


633 


26 


0.969 


0.939 


640 


20 


0.949 


0.842 


645 


25 


0.911 


0.759 


684 


25 


0.992 


0.877 


691 


32 


0.983 


0.825 


698 


44 


0.924 


0.564 


700 


19 


0.982 


0.941 


710 


26 


0.971 


0.894 


714 


23 


0.965 


0.907 


718 


19 


0.989 


0.862 


725 


21 


0.976 


0.851 


728 


33 


0.961 


0.895 


734 


25 


0.963 


0.660 


741 


34 


0.942


0.635 


744 


19 


0.959 


0.924 


747 


16 


0.922 


0.716 


756 


26 


0.973 


0.864 


767 


22 


0.986 


0.943 


768 


27 


0.916 


0.758 


769 


19 


0.987 


0.970 


770


22


0.981 


0.933 


771 


34 


0.993 


0.893 


773 


20 


0.968 


0.939 


774 


21 


0.971 


0.945 


778 


22 


0.986 


0.943 


779 


32 


0.973 


0.846 


781 


23 


0.950 


0.857 


785 


27 


0.916 


0.758


786 


27 


0.916 


0.758 


788 


22 


0.981 


0.933 


793 


22 


0.986 


0.803 


794 


39 


0.892 


0.654 


797 


27 


0.965 


0.847 


810 


22 


0.981 


0.933 


823 


34 


0.993 


0.893 


825 


17 


0.962 


0.778 


837 


20 


0.968 


0.939 


844 


25 


0.984 


0.951 


845 


17 


0.919 


0.706 


846 


21 


0.971 


0.945 


847 


21 


0.971 


0.945 


890 


22 


0.986 


0.943 


893 


24 


0.971 


0.865 


894 


24 


0.971


0.865 


896 


32 


0.973 


0.846 


899 


31 


0.982 


0.817 


922 


15 


0.882 


0.706


924 


21 


0.975 


0.948 


925 


21 


0.927


0.661 


933 


20 


0.967 


0.906


960 


20 


0.967 


0.906 


967 


38 


0.970 


0.784 


968 


47 


0.970 


0.557


972 


36 


0.945 


0.775 



TABLE 8



SEQ ID NO:


Method


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible nucleotide
insertion)


3955 


A 


235 


1272 


GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAWhfPTRWHLPAQPEMLYEGGEGRMETLK 
DKTLQELEELQM)SEAIDQLALESPEVQDLQLERE 
MALATNRSLAERNLEFQGPLEISRSNLSDRYQELR 
KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

j\jlljlllojiAlVLA.iiJsj' LiCKjii V rL«Jb 1 r LriJNr ooJNlKJVlLoH 

LRRVRVEKLQEVVRKPRASQELAGDAPPPRSPPP 
V/PPSPPGNTPCG*RAAAATISHASLPFALQPIPQPA 
CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 
AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

V 


3956 


A


821 


385 


SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 

SNHAGIQRSSRP/SHYQGEAVHDNCFTADELQLLT 

YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 

VDKEHDSAEGSHVSGQSNGRDPQALAKAVQfflQ 

DTLRTMYFA 


3957 


A 


4621 


240 


ELISTFKLLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIMIE 

KESVEVAKTEIdVKADETIANEQAMASKAIKDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKLVMEAICILIiGIKADKIPDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMNIIR 

KNYIPNPDFVPEKIRNASTAAEGLCKWVIAMDSY 

DKVAKIVAPKKIKLAAAEGELKIAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 

LTGDILISSGWAYLGAFTSTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVTIRTWNIAGLPSDSF 

SIDNGmMNARRWPLMIDPQSQANKWIKNMEKA 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSTIEYAPDFR 

FYITTKLRNPHYLPETSVKVTLLNFMITPEGMQDQ 

LLGIWAQERPDLEEEKQALDLQGAENKRQLKEIE 

DKILEVLSSSEGNILEDETAIKILSSSKALANEISQK 

QEVAEETEKKIDTTRMGYRPIAIHSSILFFSLADLA 

NffiPMYQYSLTWFINLFILSIENSEKSEILAKRLQIL 

KDHFTYSLYVNVCRSLFEKDKLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQBCS 

WDEICRLDDLPAFKTIRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVIPM 

LQEFIINRLGRAFIEPPPFDLAKAFGDSNCCAPLIFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPIAMKMLEKAVKEGTWVVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANIIRSYLMDPISDPEFFGSC 

KKPEEFKKLLYGLCFFHALVQERRKFGPLWWNIP 

YEFNETDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKFFNPELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEWNEVASDILGKLPNNFDIEAAMRRYPT 

IKGLAVMSTDLEEVVSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKFLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTEPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKIKKLAESH 

PKILYDTVPVMWLKPCKRADIPKRPSYVAPLYKT 

SERRGVLSTTGHSTNFVIA\MTLPSDQPKEHWIGR 

n\/ ATT r^/~\T XTC 


3958 


A 


35 


529 


GADMAKSKNHTTHNQSRKWHRNVIKKPLSQRYK 

SLKGVDPKFLG>nyiCFTKKHKKKGLKKMQADSA 

KAVSTCAKAIEALVKPKEVKPKIPKGVSCELN*LA 

YIAYPKFWTCACACIAKGLRLCQPKAKAQDQTK 

AQVQKAQAAAPASVPTQAPKGAQAPTKASG 


3959 


A 


1883 


763 


LLVLLLRTNLLIASSTRISRATLTCSPPGIPVDPRVR 
PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 
QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 
ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 
PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 
PQIIKEVl^WNSILELPCPHLSALASYYWSHGPAA 
VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 
SYPVISYWVDSQDQTLALDPELAGBPREHVKVPLT 
RVSGGAAIjIlAQQSYWPHFVTVTVLFALVLSGALI 
ELVASPLRALRARGKVQGCETLRPGEKAPLSREQH 


3960 


A 


1 


481 


SYAAPSLFVKSLYWALAFMAVLLAVSGVVIWLA 

SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 

SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 

AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 

ALEEGTLVAANCSTPRPWVCAKGTQ 
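
The column heading of Table 8 defines a one-letter code for each amino acid plus three annotation symbols (`*`, `/`, `\`). As a minimal sketch of how a sequence entry could be checked against that legend, the following Python function counts residues and annotation symbols and reports any character the legend does not define. The function name `summarize` and the short test strings are hypothetical and are not taken from the specification.

```python
# One-letter codes exactly as listed in the Table 8 column heading.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Annotation symbols from the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(seq):
    """Return (residue count, marker counts, characters not in the legend)."""
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in seq:
        if ch in AMINO_ACIDS:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        elif not ch.isspace():
            # Anything else is outside the legend (for example OCR damage).
            unrecognized.append(ch)
    return residues, markers, unrecognized


if __name__ == "__main__":
    print(summarize("SRP/SHYQ*"))
```

Characters reported as unrecognized are a quick way to flag lines of a sequence listing that were damaged in transcription, since the legend admits only upper-case letters and the three symbols.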



TABLE 9

SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | 193 | 25
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | 3881 | 84
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35



TABLE 10

SEQ ID NO: | Accession No. | Description | Results*
3937 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55

* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence
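
The footnote to Table 10 fixes the order of the four fields in each Results entry. As an illustrative sketch only, the following Python function splits such an entry, for example `BL00615A 16.68 6.400e-11 37-55`, into named fields. The names `SignatureHit` and `parse_result` are hypothetical, and the parser assumes the fields are separated by whitespace and the position is written as `start-end`.

```python
from typing import NamedTuple


class SignatureHit(NamedTuple):
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


def parse_result(text):
    """Parse 'SUBTYPE SCORE PVALUE START-END' in the Table 10 footnote order."""
    subtype, score, p_value, span = text.split()
    start, end = span.split("-")
    return SignatureHit(subtype, float(score), float(p_value), int(start), int(end))


if __name__ == "__main__":
    print(parse_result("BL00615A 16.68 6.400e-11 37-55"))
```

An entry with a different number of fields raises `ValueError` at the unpacking step, which is a reasonable way to surface rows that were damaged in transcription.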
TABLE 11

SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8
3941 | Sema | Sema domain | 4e-181 | 615.1
3942 | lectin c | Lectin C-type domain | 0.086 | -7.1



TABLE 12

SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score)
3941 | 31 | 0.985 | 0.926
3942 | 21 | 0.974 | 0.894


TABLE 13

SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number, corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914
3937 | 3943 | 3949 | 3955 | 787CIP2G 1 | 787 3587
3938 | 3944 | 3950 | 3956 | 787CIP2G 2 | 787 3813
3939 | 3945 | 3951 | 3957 | 787CIP2G 3 | 787 4462
3940 | 3946 | 3952 | 3958 | 787CIP2G 4 | 787 4887
3941 | 3947 | 3953 | 3959 | 787CIP2G 5 | 787 5794
3942 | 3948 | 3954 | 3960 | 787CIP2G 6 | 787 8743



TABLE 14

TISSUE ORIGIN | LIBRARY/RNA SOURCE | HYSEQ LIBRARY NAME | SEQ ID NOS:
adult brain | GIBCO | ABD003 | 3940
adult brain | Clontech | ABR006 | 3940
adult brain | Invitrogen | ABR014 | 3940
cultured preadipocytes | Stratagene | ADP001 | 3937
adult heart | GIBCO | AHR001 | 3940
adult kidney | GIBCO | AKD001 | 3940
adult lung | GIBCO | ALG001 | 3940
young liver | GIBCO | ALV001 | 3940
adult ovary | Invitrogen | AOV001 | 3938, 3940-3941
adult spleen | GIBCO | ASP001 | 3940-3941
testis | GIBCO | ATS001 | 3940
bone marrow | Clontech | BMD001 | 3938, 3940
bone marrow | Clontech | BMD004 | 3940
adult cervix | BioChain | CVX001 | 3940
endothelial cells | Stratagene | EDT001 | 3940
fetal brain | Clontech | FBR006 | 3940
fetal brain | Invitrogen | FBT002 | 3940-3941
fetal heart | Invitrogen | FHR001 | 3940
fetal kidney | Clontech | FKD001 | 3940
fetal kidney | Clontech | FKD002 | 3940
fetal liver-spleen | Columbia University | FLS001 | [illegible]
fetal liver-spleen | Columbia University | FLS002 | 3938, 3941
fetal liver-spleen | Columbia University | FLS003 | 3940
fetal liver | Clontech | FLV004 | 3940
fetal skin | Invitrogen | FSK001 | 3940-3942
fetal spleen | BioChain | FSP001 | 3940
fetal brain | GIBCO | HFB001 | 3937, 3940-3941
infant brain | Columbia University | IB2002 | [illegible]
leukocyte | GIBCO | LUC001 | 3940-3941
leukocyte | Clontech | LUC003 | 3940-3941
melanoma from cell line ATCC #CRL 1424 | Clontech | MEL004 | 3940
mammary gland | Invitrogen | MMG001 | 3937, 3940-3941
neuronal cells | Stratagene | NTU001 | 3937, 3942
prostate | Clontech | PRT001 | 3938
rectum | Invitrogen | REC001 | 3940
salivary gland | Clontech | SALs03 | 3941
small intestine | Clontech | SIN001 | 3940
skeletal muscle | Clontech | SKM001 | 3940
spinal cord | Clontech | SPC001 | 3940
thymus | Clontech | THMc02 | 3938
thyroid gland | Clontech | THR001 | 3942
uterus | Clontech | UTR001 | 3940
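
The SEQ ID NOS: column of Table 14 mixes single identifiers with hyphenated ranges, for example `3938, 3940-3941`. As a minimal sketch, assuming ranges are inclusive and entries are comma-separated, the following Python function expands such a cell into an explicit list so that library membership can be tested per sequence. The function name `expand_seq_ids` is hypothetical.

```python
def expand_seq_ids(text):
    """Expand a cell such as '3938, 3940-3941' into [3938, 3940, 3941]."""
    ids = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray or trailing commas
        if "-" in part:
            low, high = part.split("-")
            # Ranges are assumed to include both endpoints.
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


if __name__ == "__main__":
    print(expand_seq_ids("3938, 3940-3941"))
```

With the cells expanded, a question such as "which libraries contain SEQ ID NO: 3941" reduces to a simple membership test over each row.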
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex
with the polynucleotide of claim 1 for a period sufficient to form the complex; and

b) detecting the complex, so that if a complex is detected, the polynucleotide
of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex
with the polypeptide under conditions and for a period sufficient to form the complex; and

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.
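
Claim 3 recites "greater than about 90% sequence identity" with the polynucleotide of claim 1. The claims themselves do not state how identity is computed, so the following Python sketch is only a simplified illustration under stated assumptions: the two sequences are already aligned to equal length, `-` denotes a gap, a gap opposite a residue counts as a mismatch, and columns that are gaps in both sequences are ignored. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared alignment columns in which both sequences agree.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    This is an illustrative definition, not one taken from the specification.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Nine of ten positions agree.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Different alignment programs and gap conventions can give different percentages for the same pair of sequences, so any threshold comparison depends on which definition is adopted.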